How to format pandas dataframe in order to just keep one row per id?

Time:10-08

I have a pandas DataFrame like this:

id feature1 feature feature3
A 1 0 0
A 0 1 0
B 0 1 0
B 1 0 0
B 0 0 1
C 0 0 1

This is a one-hot encoded DataFrame. I would like to reshape it so that there is just one row per id:

id feature1 feature feature3
A 1 1 0
B 1 1 1
C 0 0 1

How can I do this?

CodePudding user response:

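Use groupby.max. Because every feature column holds only 0 or 1, the maximum within each id is 1 as soon as any row for that id has a 1: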

out = df.groupby('id', as_index=False).max()

output:

  id  feature1  feature  feature3
0  A         1        1         0
1  B         1        1         1
2  C         0        0         1
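
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the table in the question (column names copied as written there), so the construction step is an assumption about how the data was loaded, not part of the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question; the column names
# (including the plain "feature") are copied from the post as written.
df = pd.DataFrame({
    "id": ["A", "A", "B", "B", "B", "C"],
    "feature1": [1, 0, 0, 1, 0, 0],
    "feature": [0, 1, 1, 0, 0, 0],
    "feature3": [0, 0, 0, 0, 1, 1],
})

# For 0/1 indicator columns the per-group maximum behaves like a logical OR:
# a column ends up as 1 if any row of that id had a 1 in it.
out = df.groupby("id", as_index=False).max()
print(out)
```

Note that this only collapses the rows correctly while the columns are strict 0/1 flags; for counts or other values, max would return the largest value per id rather than a merged indicator.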