In group 3, I want to get the max value of group 1
In group 5, I want to get the max value of group 3
Input:
import pandas as pd
A=[20,13,15,25,24,13,14,19,13,11]
group=[1,1,2,2,2,3,3,4,4,5]
df=pd.DataFrame({'A':A,'group':group})
Expected Output
A group g_max g-2_max
0 20 1 20
1 13 1 20
2 15 2 25
3 25 2 25
4 24 2 25
5 13 3 14 20
6 14 3 14 20
7 19 4 19 25
8 13 4 19 25
9 11 5 11 14
CodePudding user response:
One way to go, would be as follows:
df['g_max'] = df.groupby('group')['A'].transform('max')
df['g-2_max'] = df.group.apply(lambda x: df.g_max[df.group == x-2].max())
print(df)
A group g_max g-2_max
0 20 1 20 NaN
1 13 1 20 NaN
2 15 2 25 NaN
3 25 2 25 NaN
4 24 2 25 NaN
5 13 3 14 20.0
6 14 3 14 20.0
7 19 4 19 25.0
8 13 4 19 25.0
9 11 5 11 14.0
If the values in group
are consecutive, another way to get g-2_max
could be:
s = df.groupby('group')['g_max'].max().shift(2)
s.name = 'g-2_max'
df = pd.merge(df, s, on='group')
CodePudding user response:
One approach, using a custom function to store the previous results in a deque
:
from collections import deque
def prev_max(g, d=deque(maxlen=2)):
res = np.nan
if len(d) == d.maxlen:
res = d.popleft()
d.append(g.max())
return res
df["g-2_max"] = df.groupby("group")["A"].transform(prev_max)
print(df)
Output
A group g-2_max
0 20 1 NaN
1 13 1 NaN
2 15 2 NaN
3 25 2 NaN
4 24 2 NaN
5 13 3 20.0
6 14 3 20.0
7 19 4 25.0
8 13 4 25.0
9 11 5 14.0