I have the dataframe:
df = A B l1 l2 l3
1 1 2 3 4
1 1 3 5 7
1 1 1 2 9
1 2 2 7 8
I want to groupby A,B , per columns, and put the values as a series in a cell. So the output will be:
df = A B l1 l2 l3
1 1 2,3,1 3,5,2 4,7,9
1 2 2 7 8
How can I do it? (efficiently)
Also, What is the solution of no ID columns? so
df = l1 l2 l3
2 3 4
3 5 7
1 2 9
2 7 8
and the output:
df = l1 l2 l3
2,3,1,2 3,5,2,7 4,7,9,8
CodePudding user response:
Use GroupBy.agg
with lambda function with cast to strings and join
:
df1 = df.groupby(['A','B']).agg(lambda x: ','.join(x.astype(str))).reset_index()
print (df1)
A B l1 l2 l3
0 1 1 2,3,1 3,5,2 4,7,9
1 1 2 2 7 8
For second:
df2 = df.astype(str).agg(','.join).to_frame().T
print (df2)
l1 l2 l3
0 2,3,1,2 3,5,2,7 4,7,9,8
If there are strings:
df1 = df.groupby(['A','B']).agg(','.join).reset_index()
df2 = df.agg(','.join).to_frame().T