If my dataframe looks like this:
X  Y  Z
1  a
1  b
2  c
the output should be:
X  Y  Z
1  a  a,b
1  b  a,b
2  c
Condition:
If a value of X has duplicates, every row with that X should take the Y values of all rows sharing that X, joined into a comma-separated string, and store it in column Z. Rows whose X is unique keep Z empty.
CodePudding user response:
df["Z"] = df.X.map(df.groupby("X").agg(list).apply(lambda x: "" if len(x.Y) == 1 else ",".join(x.Y), axis=1))
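The one-liner above does three things at once, so here is the same idea unrolled into steps. This is only a sketch on the sample data from the question; the intermediate names y_lists and joined are mine, not part of the original answer.

```python
import pandas as pd

# Sample data from the question; Z does not exist yet.
df = pd.DataFrame({"X": [1, 1, 2], "Y": ["a", "b", "c"]})

# Step 1: collect the Y values of each X into a list, indexed by X.
y_lists = df.groupby("X")["Y"].agg(list)

# Step 2: join groups with more than one value; single-value groups get "".
joined = y_lists.apply(lambda ys: ",".join(ys) if len(ys) > 1 else "")

# Step 3: look up each row's X in the per-group result.
df["Z"] = df["X"].map(joined)
```

Selecting ["Y"] before aggregating keeps an existing Z column out of the groupby, which the one-liner aggregates as well but never uses.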
CodePudding user response:
Use groupby.transform and mask:
g = df.groupby('X')['Y']
df['Z'] = g.transform(','.join).mask(g.transform('size')==1, '')
Output:
   X  Y    Z
0  1  a  a,b
1  1  b  a,b
2  2  c
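If you want to try this end to end, a self-contained version might look like the following. The helper name add_group_csv and its parameters are hypothetical, made up for illustration; the body is the same transform-and-mask approach.

```python
import pandas as pd


def add_group_csv(df, key="X", value="Y", out="Z"):
    """Hypothetical helper: fill `out` with the comma-joined `value`s of each duplicated `key`."""
    g = df.groupby(key)[value]
    joined = g.transform(",".join)      # per-row copy of the group's joined values
    single = g.transform("size") == 1   # True where the key occurs only once
    result = df.copy()                  # leave the caller's frame untouched
    result[out] = joined.mask(single, "")
    return result


df = pd.DataFrame({"X": [1, 1, 2], "Y": ["a", "b", "c"]})
out = add_group_csv(df)
print(out)
```

Note that ",".join only accepts strings, so if Y holds numbers or missing values you would need to convert it first, for example with df["Y"].astype(str).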