Create Unique IDs using other ID column

I have a dataframe with the following four columns: the name, a flag that identifies the primary contract, the employee type, and the ID number. Like this one:

Name           Primary row?  Employee Type  ID
Paulo Cortez   Yes           Employee       100000
Paulo Cortez   No            Employee       100000
Joan San       Yes           Non-employee   100001
Felipe Castro  Yes           Contractor     100002
Felipe Castro  No            Employee       100002
Felipe Castro  No            Contractor     100002

I need to create a sub ID column that takes the ID value and puts the first letter of the employee type in front (the type may be Employee, Non-employee or Contractor). If the ID appears more than once, it needs to check the "Primary row?" column. If it says "Yes", leave the format as is; for the others that have "No", add a tag of "-2", "-3", etc., as in the following:

Name           Primary row?  Employee Type  ID      sub ID
Paulo Cortez   Yes           Employee       100000  E100000
Paulo Cortez   No            Employee       100000  E100000-2
Joan San       Yes           Non-employee   100001  N100001
Felipe Castro  Yes           Contractor     100002  C100002
Felipe Castro  No            Employee       100002  E100002-2
Felipe Castro  No            Contractor     100002  C100002-3

What would be the best way to achieve this result?

CodePudding user response：

Here is one way to do it. First, use groupby with cumcount to number the rows within each ID; that number becomes the suffix where one is needed. Then apply a function to each row that concatenates all the parts.

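# Position of each row within its ID group: 1, 2, 3, ...
df['sub_ID'] = df.groupby('ID').cumcount().add(1)

# First letter of the type + ID, plus "-<position>" on non-primary rows
df['sub_ID'] = df.apply(lambda row:
                        row['Employee Type'][0]
                        + str(row['ID'])
                        + ("" if row['Primary row?'] == "Yes" else "-" + str(row['sub_ID']))
                        , axis=1)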

Output df:

            Name Primary row? Employee Type      ID     sub_ID
0   Paulo Cortez          Yes      Employee  100000    E100000
1   Paulo Cortez           No      Employee  100000  E100000-2
2       Joan San          Yes  Non-employee  100001    N100001
3  Felipe Castro          Yes    Contractor  100002    C100002
4  Felipe Castro           No      Employee  100002  E100002-2
5  Felipe Castro           No    Contractor  100002  C100002-3
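
The approach above gives the right suffixes as long as the "Yes" row comes first within each ID, since cumcount numbers rows in order of appearance. If that ordering is not guaranteed, one possible variation is to number only the "No" rows, starting at 2, and build the column with vectorized string operations instead of apply. This is just a sketch on the sample data from the question; the intermediate names (is_primary, suffix_no, suffix) are made up for illustration, and it assumes the "No" rows should be numbered in their order of appearance.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name": ["Paulo Cortez", "Paulo Cortez", "Joan San",
             "Felipe Castro", "Felipe Castro", "Felipe Castro"],
    "Primary row?": ["Yes", "No", "Yes", "Yes", "No", "No"],
    "Employee Type": ["Employee", "Employee", "Non-employee",
                      "Contractor", "Employee", "Contractor"],
    "ID": [100000, 100000, 100001, 100002, 100002, 100002],
})

is_primary = df["Primary row?"].eq("Yes")

# Number only the non-primary rows within each ID: 2, 3, ...
suffix_no = df.loc[~is_primary].groupby("ID").cumcount().add(2)

# Turn the numbers into "-2", "-3", ... and give primary rows an empty suffix
suffix = ("-" + suffix_no.astype(str)).reindex(df.index, fill_value="")

# First letter of the employee type + ID + optional suffix
df["sub_ID"] = df["Employee Type"].str[0] + df["ID"].astype(str) + suffix

print(df)
```

On the sample data this produces the same sub_ID values as the output shown above, and it does not depend on where the "Yes" row sits inside its group.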