I have a data frame like this:
temp_id | administ_div | municipality | combine |
---|---|---|---|
1 | Gunma | Maebashi | |
NaN | Gunma | Maebashi | CC |
NaN | Gunma | Maebashi | DD |
NaN | Gunma | Maebashi | EE |
2 | Gunma | Kiryu | AA |
How do I group by administ_div
, municipality
to this (temp_id must be not NaN):
temp_id | administ_div | municipality | combine |
---|---|---|---|
1 | Gunma | Maebashi | [CC,DD,EE] |
2 | Gunma | Kiryu | AA |
CodePudding user response:
You can use a custom groupby.agg
with help of ffill
:
# aggregate all columns with first value
agg_funcs = {c: 'first' for c in df}
# except "combine" as list of non-null values
agg_funcs['combine'] = lambda l: [x for x in l if x]
# group by filled values and aggregate
out = (df
.groupby(df['temp_id'].ffill(), as_index=False)
.agg(agg_funcs)
)
output:
temp_id administ_div municipality combine
0 1.0 Gunma Maebashi [CC, DD, EE]
1 2.0 Gunma Kiryu [AA]
CodePudding user response:
try:
df['temp_id'] = df['temp_id'].ffill()
df1 = df.groupby(['temp_id', 'administ_div', 'municipality'])['combine'].apply(lambda x: [i for i in x if not pd.isna(i)][0] if len(x)==1 else [i for i in x if not pd.isna(i)]).reset_index()
df1
temp_id administ_div municipality combine
0 1.0 Gunma Maebashi [CC, DD, EE]
1 2.0 Gunma Kiryu AA
CodePudding user response:
A solution with pandas.DataFrame.groupby
and pandas.Series.dropna
combined :
new_df = (
df.groupby(['administ_div', 'municipality'], as_index=False)
.agg(lambda x: list(x.dropna()) if len(x)>1 else x)
.assign(temp_id = lambda y: y['temp_id'].astype(str).str.strip('[]').astype(float).astype(int))
.reindex(columns=df.columns)
.sort_values(by='temp_id')
)
# Output :
print(new_df)
temp_id administ_div municipality combine
1 1 Gunma Maebashi [CC, DD, EE]
0 2 Gunma Kiryu AA
CodePudding user response:
out = (df.assign(temp_id=df.temp_id.ffill())
.groupby(['temp_id', 'administ_div', 'municipality'], as_index=False)['combine']
.agg(lambda x: x.dropna().tolist()))
print(out)
Output:
temp_id administ_div municipality combine
0 1.0 Gunma Maebashi [CC, DD, EE]
1 2.0 Gunma Kiryu [AA]