I'm trying to perform this process, imagine to have the following and I want to obtain col4. :
Col1 | Col2 | Col3 | Col 4 | Col 5 |
---|---|---|---|---|
SF | 123 | QW | QW, BF | 1 |
SF | 456 | AF | AF | 2 |
SO | xxx | AF | AF, BF | 3 |
SO | yyy | GD | GD | 4 |
SF | 123 | BF | QW, BF | 1 |
RE | xxx | BF | AF, BF | 5 |
For the purpose of aggragation I'm using these 2 lines of code:
df[df['col1']!='SF'].groupby(['Col2']).agg({'Col3' : lambda x: ','.join(x.unique())})
df[df['col1']=='SF'].groupby(['Col2','Col5']).agg({'Col3':','.join})
But I don't know how to put them on df. I tried also a merge but didn't work. I only hope to have been clear!!
Thanks so much in advance
EDIT 1 Sorry for not being clear. Before to perform any line of code I have Col1, Col2, Col3, Col5. Col4 is the output I would like to obtain.
CodePudding user response:
You transform
instead of agg
to assign back to the original DataFrame:
df["Col4"] = df.groupby("Col2")["Col3"].transform(lambda x: ", ".join(x.unique()))