Python pandas group non repeating values

Time:03-29

Hi, I have a data frame which looks like this:

         col1     col2
    0      A       1
    1      B       2
    2      C       3
    3      A       4
    4      C       5
    5      A       6

I would like to group by and sum over the non-repeating values in col1, e.g.:

A,B,C => 6
A,C => 9
A => 6

Is there any way I can do this via pandas functions?

CodePudding user response:

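IIUC, you can build the groups with groupby + cumcount, which numbers each row by how many times its col1 value has already appeared, so the nth occurrence of every col1 value gets the same group label. Then group by those labels, joining the "col1" values and summing the "col2" values: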

out = df.groupby(df.groupby('col1').cumcount()).agg({'col1':','.join, 'col2':'sum'})

Output:

    col1  col2
0  A,B,C     6
1    A,C     9
2      A     6
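
To see why this works, it may help to look at the intermediate cumcount labels separately. Below is a minimal, self-contained sketch that rebuilds the example frame from the question; the variable name `occurrence` is only illustrative and is not part of the original answer:

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "col1": ["A", "B", "C", "A", "C", "A"],
    "col2": [1, 2, 3, 4, 5, 6],
})

# For each row, count how many earlier rows share the same col1 value.
# The first A, B and C get 0, the second A and C get 1, the third A gets 2.
occurrence = df.groupby("col1").cumcount()
print(occurrence.tolist())  # [0, 0, 0, 1, 1, 2]

# Group by those labels: join the col1 values and sum the col2 values.
out = df.groupby(occurrence).agg({"col1": ",".join, "col2": "sum"})
print(out)
```

Note that ','.join assumes col1 holds strings; a column of another dtype would need converting first, for example with astype(str).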