Home > front end >  Pandas Merge and Complete rows with same id
Pandas Merge and Complete rows with same id

Time:05-20

Here is an extract of my dataframe :

ID LU MA ME JE VE SA DI
200 B B B
201 C C C C C
211 A
211 D D D
211 B
213 A A
216 K K K K
216 K
217 B B B B B

I have some rows with same ID and want to "merge" them into only one row while completing them. Here is an example of what I want to have as a result :

ID LU MA ME JE VE SA DI
200 B B B
201 C C C C C
211 A D D D B
213 A A
216 K K K K K
217 B B B B B

I'm new to pandas dataframes and have try to use drop_duplicates method but I need something different because of the restriction on keep parameters. Also the dataframe is sorted by ID.

CodePudding user response:

If there is only one non empty value per groups use:

df = df.replace('',np.nan).groupby('ID', as_index=False).first().fillna('')

If possible multiple values and need unique values in original order use lambda function:

print (df)
    ID LU MA ME JE VE SA DI
0  201  B     C  B         
1  201  C  C  C  B  C    


f = lambda x: ','.join(dict.fromkeys(x.dropna()).keys())
df = df.replace('',np.nan).groupby('ID', as_index=False).agg(f)
print (df)
    ID   LU MA ME JE VE SA DI
0  201  B,C  C  C  B  C      

CodePudding user response:

This could be treated as a pivot. You'd need to melt the df first then pivot:

(df.melt(id_vars='ID')
   .dropna()
   .pivot(index='ID',columns='variable',values='value')
   .fillna('')
   .rename_axis(None, axis=1)
   .reset_index()
)
  • Related