Home > Software engineering >  Merge rows with pandas and append the different columns
Merge rows with pandas and append the different columns

Time:01-11

I have the following dataframe :

CONTACT_ID COMPANY_ID FIRST_NAME SURNAME EMAIL COMPANY_NAME
1 8 Martin GERARD [email protected] BIG COMPANY
1 87 Martin GERARD [email protected] LITTLE COMPANY
30 12 Jean DUPOND [email protected] MEGA COMPANY

I'd like to use pandas to convert it like so :

CONTACT_ID FIRST_NAME SURNAME EMAIL COMPANY_ID_1 COMPANY_NAME_1 COMPANY_ID_2 COMPANY_NAME_2
1 Martin GERARD [email protected] 8 BIGCOMPANY 87 LITTLECOMPANY
30 Jean DUPOND [email protected] 12 MEGA COMPANY

One contact can even have more than two companies.

CodePudding user response:

Create list of columns used for create groups, then converte them to index by DataFrame.set_index and counter by GroupBy.cumcount, reshape by DataFrame.unstack with DataFrame.sort_index, flatten MultiIndex and last convert levels from list to columns:

cols = ['CONTACT_ID','FIRST_NAME','SURNAME','EMAIL']

df = (df.set_index([*cols, df.groupby(cols).cumcount().add(1)])
        .unstack(fill_value='')
        .sort_index(axis=1, level=1,sort_remaining=False))

df.columns = df.columns.map(lambda x: f'{x[0]}_{x[1]}')
df = df.reset_index()
print (df)
   CONTACT_ID FIRST_NAME SURNAME                    EMAIL COMPANY_ID_1  \
0           1     Martin  GERARD  [email protected]            8   
1          30       Jean  DUPOND      [email protected]           12   

  COMPANY_NAME_1 COMPANY_ID_2  COMPANY_NAME_2  
0    BIG COMPANY           87  LITTLE COMPANY  
1   MEGA COMPANY      
  • Related