Home > database >  How to flatten a Pandas data frame per groupby in Python?
How to flatten a Pandas data frame per groupby in Python?

Time:08-24

I have the following Pandas data frame:

id  c1 c2
1   A  B
1   C  D
2   E  F
2   G  H
3   I  J
3   K  L 

(IDs always occur in the same number respectively.) I want to "flatten" this DF by ID (concatenate rows to have a single row by ID) to get the outcome like this:

1 A B C D
2 E F G H
3 I J K L

How can I get there? (Column names of the output DF doesn't matter.)

CodePudding user response:

groupby can be used, too.

>>> (
     df.filter(like="c").groupby(df.id)
       .apply(lambda f: pd.Series(f.values.flatten()))
    )
    0  1  2  3
id            
1   A  B  C  D
2   E  F  G  H
3   I  J  K  L

CodePudding user response:

You can use a pivot:

(df.assign(col=df.groupby('id').cumcount())
   .pivot(index='id', columns='col')
   .sort_index(level=1, axis=1, sort_remaining=False)
   #.pipe(lambda d: d.set_axis(range(d.shape[1]), axis=1))
 )

Output:

    c1 c2 c1 c2
col  0  0  1  1
id             
1    A  B  C  D
2    E  F  G  H
3    I  J  K  L

Alternative output (uncomment commented line):

    0  1  2  3
id            
1   A  B  C  D
2   E  F  G  H
3   I  J  K  L
  • Related