Group by all the columns except the first one, but aggregate the first column as a list

Time:01-25

Let's say I have this DataFrame:

import pandas as pd
df = pd.DataFrame({'col_1': ['yes','no'], 'test_1':['a','b'], 'test_2':['a','b']})

I want to group by all the columns except the first one and, for each group of rows that share the same values in those columns, collect the first column's values into a list.

This is what I'm trying:

col_names = df.columns.to_list()

df_out = df.groupby([col_names[1:]])[col_names[0]].agg(list)

This is the DataFrame I want to end up with:

df = pd.DataFrame({'col_1': [['yes','no']], 'test_1':['a'], 'test_2':['b']})

If I have more rows, I want the same principle to apply: rows that have identical values in columns [1:] (from the second column to the last) should be merged into one row, with their first-column values joined into a list.

CodePudding user response:

Use the pandas agg() method after grouping by every column except col_1:

df = df.groupby(df.columns.difference(["col_1"]).tolist()).agg(
    lambda x: x.tolist()).reset_index()
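
As a minimal sketch of the same idea applied to the question's own approach, the snippet below passes the flat list col_names[1:] to groupby (without the extra pair of brackets used in the attempt) and aggregates only the first column. The three-row sample data is hypothetical, chosen so that two rows share a group. The sort=False argument and the final column reorder are optional choices, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data: the first two rows share the same
# (test_1, test_2) pair, the third row forms its own group.
df = pd.DataFrame({
    'col_1': ['yes', 'no', 'maybe'],
    'test_1': ['a', 'a', 'b'],
    'test_2': ['b', 'b', 'c'],
})

col_names = df.columns.to_list()
first_col, group_cols = col_names[0], col_names[1:]

# Pass the flat list of column names to groupby, select the first
# column, and collect its values into a list per group.
df_out = (
    df.groupby(group_cols, sort=False)[first_col]
      .agg(list)
      .reset_index()
)

# reset_index puts the group columns first; restore the original order.
df_out = df_out[col_names]

print(df_out)
```

Here the two rows with test_1 == 'a' and test_2 == 'b' collapse into a single row whose col_1 holds the list ['yes', 'no'].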