According to this thread, when you want to apply the same aggregation function to multiple columns, you have to name the columns. Now consider a situation where I have many columns (30, for example). Is there any way to do the aggregation without naming the columns? I mean, is there anything like this?
import pandas as pd
df = pd.DataFrame(...)
df.groupby('id').agg(lambda: col -> [sum(col) if col != id])  # pseudocode: sum every column except id
CodePudding user response:
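The solution you linked uses df.groupby('id')['x1', 'x2'].agg('sum'). Note that recent pandas versions expect the selection to be a list, i.e. df.groupby('id')[['x1', 'x2']].agg('sum').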
So, to aggregate every column except a few:
# Include the grouping key 'id' in the exclusion list so it is not summed as well
columns_to_exclude = ['id', 'year', 'month', 'day']
columns_to_aggregate = [col for col in df.columns if col not in columns_to_exclude]
df.groupby('id')[columns_to_aggregate].agg('sum')
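For reference, here is a minimal self-contained sketch of the same idea. The sample data and the column names (x1, x2, year) are made up for illustration and are not from your DataFrame. It also shows one possible alternative, assuming you are happy to drop the unwanted columns before grouping.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "year": [2020, 2021, 2020, 2021],
    "x1": [10, 20, 30, 40],
    "x2": [1.5, 2.5, 3.5, 4.5],
})

# Build the list of columns to aggregate by exclusion, so none of the
# (possibly many) value columns has to be named explicitly.
columns_to_exclude = ["id", "year"]
columns_to_aggregate = [col for col in df.columns if col not in columns_to_exclude]

result = df.groupby("id")[columns_to_aggregate].agg("sum")

# Alternative sketch: drop the unwanted columns first, then aggregate
# everything that is left. The grouping key is not summed in this form.
alternative = df.drop(columns=["year"]).groupby("id").sum()

print(result)
```

Both forms should give the same per-id sums here. The exclusion list is probably the more readable choice when you only want to skip a handful of columns out of 30.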