Update columns with duplicate values from the DataFrame in Pandas

Time:06-20

I have a data set in which the values for one person are spread across several rows that share the same name, each row filling in a different column. For instance, James's gender is in the first row and James's age is in the 5th row.

DataFrame df1=

Index  First Name  Age  Gender  Weight in lb  Height in cm
0      James            Male
1      John                     175
2      Patricia    23
5      James       22
4      James                    185
5      John        29
6      John                                   176

I am trying to combine these rows so that each name has a single row, as below. df1=

Index  First Name  Age  Gender  Weight  Height
0      James       22   Male    185
1      John        29           175     176
2      Patricia    23

I tried using groupby, but it did not work.

CodePudding user response:

Assuming NaN in the empty cells, you can use groupby.first, which returns the first non-null value of each column within each group:

df.groupby('First Name', as_index=False).first()

output:

  First Name   Age Gender  Weight in lb  Height in cm
0      James  22.0   Male         185.0           NaN
1       John  29.0   None         175.0         176.0
2   Patricia  23.0   None           NaN           NaN
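
For a reproducible version of the above, here is a minimal sketch that rebuilds the question's data and applies the same groupby.first call. The placement of each value in its column is inferred from the output shown above, and the names df and combined are only illustrative. The replace step is an assumption on my part: it only matters if the blanks in your real data are empty strings rather than NaN, since groupby.first skips only null values.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data, with empty cells stored as NaN/None (assumption).
df = pd.DataFrame({
    "First Name": ["James", "John", "Patricia", "James", "James", "John", "John"],
    "Age": [np.nan, np.nan, 23, 22, np.nan, 29, np.nan],
    "Gender": ["Male", None, None, None, None, None, None],
    "Weight in lb": [np.nan, 175, np.nan, np.nan, 185, np.nan, np.nan],
    "Height in cm": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 176],
})

# Optional: if the blanks are empty strings instead of NaN, convert them first,
# because first() only skips null values.
df = df.replace("", np.nan)

# One row per name, taking the first non-null value found in each column.
combined = df.groupby("First Name", as_index=False).first()
print(combined)
```

Note that first() keeps only one value per column and group, so if a name has two different non-null values in the same column, the later ones are silently dropped.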