Summarising pandas data frame by multiple fields and collapsing into a single column

Time:11-18

I am trying to group and summarise a pandas DataFrame into a single column:

ID LayerName Name Count
A SC B 2
A SC R 8
A BLD S 7
A BLD K 6

I would like the resulting table to be summarised by LayerName, Name and Count into a single output field, like this:

ID Output
A 10 - SC : (B,R) ; 13 - BLD : (S,K)

CodePudding user response:

You need a double groupby.agg:

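(df.groupby(['ID', 'LayerName'],
            as_index=False, sort=False)
   # first pass: one row per (ID, LayerName) with joined names and summed counts
   .agg({'Name': ','.join, 'Count': 'sum'})
   .assign(Output=lambda d: d['Count'].astype(str)
                            + ' - ' + d['LayerName']
                            + ' : (' + d['Name'] + ')')
   # second pass: join the per-layer strings into one row per ID
   .groupby('ID', as_index=False, sort=False)
   .agg({'Output': ' ; '.join})
)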

Output:

  ID                              Output
0  A  10 - SC : (B,R) ; 13 - BLD : (S,K)
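
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same two-step groupby, built on the sample rows from the question. The function name summarise is only illustrative and is not part of pandas.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["A", "A", "A", "A"],
    "LayerName": ["SC", "SC", "BLD", "BLD"],
    "Name": ["B", "R", "S", "K"],
    "Count": [2, 8, 7, 6],
})


def summarise(frame: pd.DataFrame) -> pd.DataFrame:
    """Collapse LayerName, Name and Count into one Output string per ID.

    Hypothetical helper name, used here only for illustration.
    """
    # Step 1: one row per (ID, LayerName), joining names and summing counts.
    per_layer = (
        frame.groupby(["ID", "LayerName"], as_index=False, sort=False)
             .agg({"Name": ",".join, "Count": "sum"})
    )
    # Build the "<count> - <layer> : (<names>)" fragment for each layer.
    per_layer["Output"] = (
        per_layer["Count"].astype(str)
        + " - " + per_layer["LayerName"]
        + " : (" + per_layer["Name"] + ")"
    )
    # Step 2: join the per-layer fragments into a single row per ID.
    return (
        per_layer.groupby("ID", as_index=False, sort=False)
                 .agg({"Output": " ; ".join})
    )


result = summarise(df)
print(result)
```

Passing sort=False to both groupby calls keeps the layers in their order of first appearance (SC before BLD), which matches the output requested in the question.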

CodePudding user response:

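# Builds one string for the whole frame (fine here, since the sample has a single ID).
# x.name is the (ID, LayerName) group key, so x.name[1] is the layer name.
(df.groupby(["ID", "LayerName"], sort=False)
   .apply(lambda x: f"{x.Count.sum()} - {x.name[1]}: ({','.join(x.Name.to_list())})")
   .str.cat(sep="; "))
# '10 - SC: (B,R); 13 - BLD: (S,K)'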
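
Note that the final .str.cat joins every group into one string, so rows belonging to different IDs would be merged together. If one row per ID is needed, a rough sketch with plain loops over the groups could look like the following. The extra "B" row and the helper name layer_summary are made up for illustration and do not come from the question.

```python
import pandas as pd

# Rows for ID "A" come from the question; the ID "B" row is a hypothetical addition
# to show that each ID gets its own output string.
df = pd.DataFrame({
    "ID": ["A", "A", "A", "A", "B"],
    "LayerName": ["SC", "SC", "BLD", "BLD", "SC"],
    "Name": ["B", "R", "S", "K", "X"],
    "Count": [2, 8, 7, 6, 3],
})


def layer_summary(layer_name: str, group: pd.DataFrame) -> str:
    """Format one layer as '<total> - <layer>: (<names>)' (hypothetical helper)."""
    return f"{group['Count'].sum()} - {layer_name}: ({','.join(group['Name'])})"


rows = []
for id_value, id_group in df.groupby("ID", sort=False):
    # Summarise each layer inside this ID, in order of first appearance.
    parts = [
        layer_summary(layer, layer_group)
        for layer, layer_group in id_group.groupby("LayerName", sort=False)
    ]
    rows.append({"ID": id_value, "Output": "; ".join(parts)})

result = pd.DataFrame(rows)
print(result)
```

This is slower than a vectorised groupby.agg on large frames, but it avoids relying on how groupby.apply treats the grouping columns, which has changed between pandas versions.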