I have a pandas dataframe like this:
Column1 | Column2 | Column3 |
---|---|---|
a | k | x |
a | l | y |
b | k | z |
I want to transform this dataframe to this:
Column1 | Column2 | Column3 |
---|---|---|
a | "k,l" | "x,y" |
b | k | z |
I found similar examples but couldn't find an exact solution to my problem. Thank you so much for your help!
CodePudding user response:
Try groupby then agg:
df_ = (df.groupby(['Column1'])
.agg({'Column2': lambda x: ','.join(x), 'Column3': lambda x: ','.join(x)})
.reset_index()
)
print(df_)
Column1 Column2 Column3
0 a k,l x,y
1 b k z
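To run this end to end, here is a minimal self-contained sketch. The construction of `df` is an assumption, since the question only shows the table and not the code that built it. It also passes `",".join` directly instead of a lambda, which should behave the same for string columns.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question's table (assumed construction).
df = pd.DataFrame({
    "Column1": ["a", "a", "b"],
    "Column2": ["k", "l", "k"],
    "Column3": ["x", "y", "z"],
})

# Join the strings of each group with a comma, giving one row per Column1 value.
df_ = (
    df.groupby("Column1")
      .agg({"Column2": ",".join, "Column3": ",".join})
      .reset_index()
)
print(df_)
```

Note that `",".join` only accepts strings, so non-string columns would need something like `.astype(str)` first.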
If you need quotation marks around the joined values:
df_ = (df.groupby(['Column1'])
.agg({'Column2': lambda x: f'"{",".join(x)}"' if len(x) > 1 else x.iloc[0],
'Column3': lambda x: f'"{",".join(x)}"' if len(x) > 1 else x.iloc[0]})
.reset_index()
)
Column1 Column2 Column3
0 a "k,l" "x,y"
1 b k z
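If the two lambdas feel repetitive, the quoting rule can be pulled into a small named function. This is only a sketch: `join_quoted` is a hypothetical helper name, not something from pandas, and the sample `df` is rebuilt by hand from the question's table.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question's table (assumed construction).
df = pd.DataFrame({
    "Column1": ["a", "a", "b"],
    "Column2": ["k", "l", "k"],
    "Column3": ["x", "y", "z"],
})


def join_quoted(values: pd.Series) -> str:
    """Hypothetical helper: join with commas, quoting only multi-value groups."""
    joined = ",".join(values)
    return f'"{joined}"' if len(values) > 1 else joined


# Apply the same helper to both columns; it always returns a plain string.
quoted = (
    df.groupby("Column1")
      .agg({"Column2": join_quoted, "Column3": join_quoted})
      .reset_index()
)
print(quoted)
```

Returning a plain string in both branches keeps the aggregation result a scalar for every group.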
CodePudding user response:
You can do it with groupby and agg:
df.groupby(["Column1"], as_index=False).agg(lambda x: ",".join(x))
EDIT: Just found out the lambda is not even needed here:
df.groupby(["Column1"], as_index=False).agg(",".join)
Output:
Column1 Column2 Column3
0 a k,l x,y
1 b k z
CodePudding user response:
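You can also join the strings with a custom function passed to transform on the groupby. Note that transform returns a result with the same number of rows as the original dataframe and without Column1, so the joined string is repeated on every row of its group rather than collapsed to one row per group: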
df.groupby(['Column1']).transform(lambda val: ','.join(val))
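To see how transform differs from agg here, this small sketch runs both on the sample data. The frame is rebuilt by hand from the question's table, and the variable names `broadcast` and `collapsed` are only illustrative.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question's table (assumed construction).
df = pd.DataFrame({
    "Column1": ["a", "a", "b"],
    "Column2": ["k", "l", "k"],
    "Column3": ["x", "y", "z"],
})

# transform keeps one row per original row and broadcasts the joined string.
broadcast = df.groupby(["Column1"]).transform(lambda val: ",".join(val))
print(broadcast)

# agg reduces each group to a single row, which is the shape the question asks for.
collapsed = df.groupby(["Column1"], as_index=False).agg(",".join)
print(collapsed)
```

So transform is handy when the joined value should be attached back to every original row, while agg fits when the goal is one row per group.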