Suppose I have a DataFrame and want to group by 'item', find the rows with the min and max of thing1 in each group, and return the thing2 values from those rows. For item 'c', for example, the min of thing1 is 1, so 0 is returned, and the max is 3, so 100 is returned. The input:

item | thing1 | thing2 |
---|---|---|
a | 1 | 10 |
a | 4 | 20 |
b | 1 | 30 |
c | 1 | 0 |
c | 2 | 10 |
c | 3 | 100 |

Desired output:

item | min_thing1 -> thing2 | max_thing1 -> thing2 |
---|---|---|
a | 10 | 20 |
b | 30 | 30 |
c | 0 | 100 |

I know I can aggregate the min and max of thing1 by simply writing:
df.groupby('item').agg({'thing1': ['min', 'max']})
But how would I aggregate thing2 using the min and max of thing1?
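
To make the question reproducible, here is a minimal sketch that builds the sample frame from the table above and runs the aggregation I already have. The variable name `thing1_range` is only for illustration.

```python
import pandas as pd

# Sample data copied from the input table in the question.
df = pd.DataFrame({
    "item":   ["a", "a", "b", "c", "c", "c"],
    "thing1": [1, 4, 1, 1, 2, 3],
    "thing2": [10, 20, 30, 0, 10, 100],
})

# Min and max of thing1 per item. thing2 is not involved at all,
# which is exactly the part that is still missing.
thing1_range = df.groupby("item").agg({"thing1": ["min", "max"]})
print(thing1_range)
```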
CodePudding user response:
You can try something like this:
df.groupby('item')['thing1'].agg(['idxmin', 'idxmax']).stack().map(df['thing2']).unstack()
Output:
      idxmin  idxmax
item
a         10      20
b         30      30
c          0     100
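
If the one-liner is hard to follow, the same idea can be split into two steps. This is only a sketch: it assumes the frame has unique row labels (the default RangeIndex here), and the column names `min_thing1` and `max_thing1` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "item":   ["a", "a", "b", "c", "c", "c"],
    "thing1": [1, 4, 1, 1, 2, 3],
    "thing2": [10, 20, 30, 0, 10, 100],
})

# Step 1: per item, the row labels where thing1 is smallest and largest.
idx = df.groupby("item")["thing1"].agg(["idxmin", "idxmax"])

# Step 2: look up thing2 at those row labels, one column at a time.
# to_numpy() drops the original row labels so the values line up with
# the per-item index instead.
result = pd.DataFrame(
    {
        "min_thing1": df.loc[idx["idxmin"], "thing2"].to_numpy(),
        "max_thing1": df.loc[idx["idxmax"], "thing2"].to_numpy(),
    },
    index=idx.index,
)
print(result)
```

Note that idxmin and idxmax return the first matching row when thing1 has ties within a group.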
CodePudding user response:
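You can also look up thing2 directly at the rows where thing1 is smallest and largest in each group:
>>> df.groupby('item')[['thing1', 'thing2']].apply(lambda g: pd.Series({'min': g.loc[g['thing1'].idxmin(), 'thing2'], 'max': g.loc[g['thing1'].idxmax(), 'thing2']}))
      min  max
item
a      10   20
b      30   30
c       0  100
Taking the plain min and max of thing2 would give the same numbers here, but only because thing2 increases with thing1 within every item.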
Can, of course, rename the columns after the operation.
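
A further variant, sketched here as an alternative rather than as what the answer above does: sort by thing1, take the first and last thing2 in each group, and rename the columns to match the desired table. It assumes thing2 has no missing values, since 'first' and 'last' skip nulls.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "item":   ["a", "a", "b", "c", "c", "c"],
    "thing1": [1, 4, 1, 1, 2, 3],
    "thing2": [10, 20, 30, 0, 10, 100],
})

# After sorting, the first row of each item has the smallest thing1
# and the last row has the largest. mergesort keeps ties in their
# original order.
ordered = df.sort_values("thing1", kind="mergesort")

result = (
    ordered.groupby("item")["thing2"]
    .agg(["first", "last"])
    .rename(columns={
        "first": "min_thing1 -> thing2",
        "last": "max_thing1 -> thing2",
    })
)
print(result)
```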