Pandas Groupby - Append lists

My pandas DataFrame has a column whose values are lists. I'd like to group the DataFrame, aggregate it, and concatenate the lists within each group.

Here's a sample DataFrame:

import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2],
    'cat': ['A', 'A', 'B'],
    'lst': [['l0', 'l1', 'l2'], ['l3', 'l4'], ['lb']],
    'v': [10, 20, 10],
})

Column v should be aggregated with mean.

Expected output:

id  cat  lst                         v
1   A    ['l0','l1','l2','l3','l4']  15
2   B    ['lb']                      10

CodePudding user response：

A simple way is to aggregate the lst column with sum, which concatenates the lists, and v with mean:

df.groupby(['id', 'cat'], as_index=False).agg({'lst': 'sum', 'v': 'mean'})

   id cat                   lst     v
0   1   A  [l0, l1, l2, l3, l4]  15.0
1   2   B                  [lb]  10.0
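To see why sum concatenates here, it may help to look at plain Python: adding two lists with + joins them, so summing a group of lists yields one combined list. The snippet below is only an illustration of that idea; the names parts and combined are made up for the example, and it does not claim to show how pandas implements the aggregation internally.

```python
# Illustration only: `parts` stands in for the lists of one group (id=1, cat='A').
parts = [['l0', 'l1', 'l2'], ['l3', 'l4']]

# + on lists concatenates, so sum() with an empty-list start value joins them.
combined = sum(parts, [])

print(combined)  # ['l0', 'l1', 'l2', 'l3', 'l4']
```

Each + builds a new list, so this kind of repeated concatenation can get slow when a group contains many long lists. Note also that mean returns a float, which is why v shows 15.0 rather than 15.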

CodePudding user response：

Flattening the lists of each group with a list comprehension also works:

# Group by id and cat; flatten the nested lists in lst with a lambda and take the mean of v
df.groupby(['id', 'cat'], as_index=False).agg({
    'lst': lambda lst: [x for s_l in lst for x in s_l],
    'v': 'mean',
})
   id cat                   lst     v
0   1   A  [l0, l1, l2, l3, l4]  15.0
1   2   B                  [lb]  10.0
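A variation on the same idea, offered as a sketch rather than as a better answer: the flattening can be delegated to itertools.chain.from_iterable from the standard library. The helper name flatten is an assumption made for this example and does not appear in the answers above.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 2],
    'cat': ['A', 'A', 'B'],
    'lst': [['l0', 'l1', 'l2'], ['l3', 'l4'], ['lb']],
    'v': [10, 20, 10],
})


def flatten(lists):
    # `lists` is the Series of list values belonging to one group;
    # chain.from_iterable walks each inner list once and yields its items.
    return list(chain.from_iterable(lists))


# Same groupby/agg shape as above, with the hypothetical helper in place of the lambda.
result = df.groupby(['id', 'cat'], as_index=False).agg({'lst': flatten, 'v': 'mean'})
print(result)
```

This should give the same lst and v values as the two answers above. A named function can be easier to reuse and test than an inline lambda, though for a one-off aggregation the lambda is just as reasonable.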