Existing dataframe :
id sub-id amount
01 A 100
01 A 50
01 B 10
01 B 5
01 B 50
01 C 10
02 A 25
02 A 10
Expected Dataframe :
id sub-id A_max B_max C_max
01 A 100 50 10
02 A 25 0 0
Please note their can total 5 sub-id which may or may not be present for every id
with this code df.loc[df.groupby(['id','sub-id'])['amount'].idxmax()].reset_index(drop=True)
i can get the answer but i need it in the above expected format
please guide
@jezrael
adding up further if we have below table :
sub-id A_max B_max C_max
A 100 50 10
A 25 0 0
B 250 10 50
B 100 50 70
and expected output we need is :
sub-id A_max B_max C_max
A 100 50 10
B 250 50 70
how to get the same..?
CodePudding user response:
Use DataFrame.pivot_table
with aggregate max
and then add to sub-id
values - if need first value per id
use:
df2 = df.pivot_table(index='id',
columns='sub-id',
values='amount',
aggfunc='max', fill_value=0).add_suffix('_max')
df = df.groupby('id', as_index=False)['sub-id'].first().join(df2, on='id')
print (df)
id sub-id A_max B_max C_max
0 1 A 100 50 10
1 2 A 25 0 0
Or if need id
with maximal amount
use DataFrameGroupBy.idxmax
:
df2 = df.pivot_table(index='id',
columns='sub-id',
values='amount',
aggfunc='max', fill_value=0).add_suffix('_max')
df = df.loc[df.groupby('id')['amount'].idxmax(), ['id','sub-id']].join(df2, on='id')
print (df)
id sub-id A_max B_max C_max
0 1 A 100 50 10
6 2 A 25 0 0
EDIT: If need maximal values per sub-id
use:
df1 = df.groupby('sub-id', as_index=False).max()