I have the dataframe below.
**Col1** **Col2**
ABC S1
ABC S2
BCD S3
FJK S5
XYZ S7
XYZ S8
I need the output in the format below.
data = {'ABC': '[S1,S2]', 'BCD': '[S3]', 'FJK': '[S5]', 'XYZ': '[S7,S8]'}
Can anyone please help me achieve the above output? That would be great!
CodePudding user response:
As mentioned in the comments, aggregate each group to a list and then convert to a dictionary:
d = df.groupby('Col1').Col2.agg(list).to_dict()
print (d)
{'ABC': ['S1', 'S2'], 'BCD': ['S3'], 'FJK': ['S5'], 'XYZ': ['S7', 'S8']}
For string values, use an f-string in a lambda function:
d = df.groupby('Col1').Col2.agg(lambda x: f"[{','.join(x)}]").to_dict()
print (d)
{'ABC': '[S1,S2]', 'BCD': '[S3]', 'FJK': '[S5]', 'XYZ': '[S7,S8]'}
For JSON, use Series.to_json:
j = df.groupby('Col1').Col2.agg(list).to_json()
print (j)
{"ABC":["S1","S2"],"BCD":["S3"],"FJK":["S5"],"XYZ":["S7","S8"]}
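For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the dataframe from the question and applies the three variants. The variable names `as_lists`, `as_strings`, and `as_json` are just illustrative, not from the original post.

```python
import pandas as pd

# Rebuild the dataframe shown in the question
df = pd.DataFrame({
    "Col1": ["ABC", "ABC", "BCD", "FJK", "XYZ", "XYZ"],
    "Col2": ["S1", "S2", "S3", "S5", "S7", "S8"],
})

# Values as Python lists
as_lists = df.groupby("Col1")["Col2"].agg(list).to_dict()

# Values as bracketed strings, matching the format asked for
as_strings = df.groupby("Col1")["Col2"].agg(lambda x: f"[{','.join(x)}]").to_dict()

# JSON text instead of a Python dict
as_json = df.groupby("Col1")["Col2"].agg(list).to_json()

print(as_lists)
print(as_strings)
print(as_json)
```

Note that groupby sorts the group keys by default, so the dictionary keys come out in sorted order of Col1.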
CodePudding user response:
If you really want a string as the value, join the strings during aggregation and add the brackets afterwards:
data = ('[' + df.groupby('Col1')['Col2'].agg(','.join) + ']').to_dict()
Output:
{'ABC': '[S1,S2]',
'BCD': '[S3]',
'FJK': '[S5]',
'XYZ': '[S7,S8]'}
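One caveat: `','.join` only works when every value in Col2 is already a string. If the column might hold numbers, a possible workaround is to cast it first. The sketch below uses a made-up numeric column purely for illustration; it is not data from the question.

```python
import pandas as pd

# Hypothetical data where Col2 is numeric rather than string
df = pd.DataFrame({
    "Col1": ["ABC", "ABC", "BCD"],
    "Col2": [1, 2, 3],
})

# Cast to str before joining, otherwise str.join raises a TypeError
joined = (
    df.assign(Col2=df["Col2"].astype(str))
      .groupby("Col1")["Col2"]
      .agg(",".join)
)

# Wrap each joined string in brackets and convert to a dict
data = ("[" + joined + "]").to_dict()
print(data)
```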