I have the following Pandas data frame:
id c1 c2
1 A B
1 C D
2 E F
2 G H
3 I J
3 K L
(IDs always occur in the same number respectively.) I want to "flatten" this DF by ID (concatenate rows to have a single row by ID) to get the outcome like this:
1 A B C D
2 E F G H
3 I J K L
How can I get there? (Column names of the output DF doesn't matter.)
CodePudding user response:
groupby
can be used, too.
>>> (
df.filter(like="c").groupby(df.id)
.apply(lambda f: pd.Series(f.values.flatten()))
)
0 1 2 3
id
1 A B C D
2 E F G H
3 I J K L
CodePudding user response:
You can use a pivot
:
(df.assign(col=df.groupby('id').cumcount())
.pivot(index='id', columns='col')
.sort_index(level=1, axis=1, sort_remaining=False)
#.pipe(lambda d: d.set_axis(range(d.shape[1]), axis=1))
)
Output:
c1 c2 c1 c2
col 0 0 1 1
id
1 A B C D
2 E F G H
3 I J K L
Alternative output (uncomment commented line):
0 1 2 3
id
1 A B C D
2 E F G H
3 I J K L