Hi, I have a data frame which looks like this:
  col1  col2
0    A     1
1    B     2
2    C     3
3    A     4
4    C     5
5    A     6
I would like to group the rows so that no value in col1 repeats within a group, and then sum col2 for each group, e.g.:
A,B,C => 6
A,C => 9
A => 6
Is there any way I can do this via pandas functions?
CodePudding user response:
IIUC, you could create groups using groupby and cumcount (the nth occurrence of each col1 value gets the same group number); then group by those numbers, join the "col1" values and sum the "col2" values:
out = df.groupby(df.groupby('col1').cumcount()).agg({'col1':','.join, 'col2':'sum'})
Output:
    col1  col2
0  A,B,C     6
1    A,C     9
2      A     6
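
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach on the sample data from the question. The variable name `occurrence` is my own, added only to show the intermediate step; the rest follows the one-liner above.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'col1': ['A', 'B', 'C', 'A', 'C', 'A'],
    'col2': [1, 2, 3, 4, 5, 6],
})

# Number each row by how many times its col1 value has already appeared:
# first occurrences get 0, second occurrences get 1, and so on.
# (`occurrence` is an illustrative name, not part of the original answer.)
occurrence = df.groupby('col1').cumcount()
print(occurrence.tolist())  # [0, 0, 0, 1, 1, 2]

# Group by that occurrence number, join the col1 labels and sum col2
out = df.groupby(occurrence).agg({'col1': ','.join, 'col2': 'sum'})
print(out)
```

Splitting out `occurrence` makes it easier to see why rows 0, 1 and 2 land in the first group, rows 3 and 4 in the second, and row 5 in the third.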