Home > OS >  How to get the maximum value of a group in the past
How to get the maximum value of a group in the past

Time:07-24

In group 3, I want to get the max value of group 1

In group 5, I want to get the max value of group 3

Input:

import pandas as pd
A=[20,13,15,25,24,13,14,19,13,11]
group=[1,1,2,2,2,3,3,4,4,5]
df=pd.DataFrame({'A':A,'group':group})

Expected Output

    A   group   g_max   g-2_max
0   20  1         20    
1   13  1         20    
2   15  2         25    
3   25  2         25    
4   24  2         25    
5   13  3         14    20
6   14  3         14    20
7   19  4         19    25
8   13  4         19    25
9   11  5         11    14

CodePudding user response:

One way to go, would be as follows:

df['g_max'] = df.groupby('group')['A'].transform('max')
df['g-2_max'] = df.group.apply(lambda x: df.g_max[df.group == x-2].max())

print(df)

    A  group  g_max  g-2_max
0  20      1     20      NaN
1  13      1     20      NaN
2  15      2     25      NaN
3  25      2     25      NaN
4  24      2     25      NaN
5  13      3     14     20.0
6  14      3     14     20.0
7  19      4     19     25.0
8  13      4     19     25.0
9  11      5     11     14.0

If the values in group are consecutive, another way to get g-2_max could be:

s = df.groupby('group')['g_max'].max().shift(2)
s.name = 'g-2_max'
df = pd.merge(df, s, on='group')

CodePudding user response:

One approach, using a custom function to store the previous results in a deque:

from collections import deque

def prev_max(g, d=deque(maxlen=2)):
    res = np.nan
    if len(d) == d.maxlen:
        res = d.popleft()
    d.append(g.max())
    return res

df["g-2_max"] = df.groupby("group")["A"].transform(prev_max)
print(df)

Output

    A  group  g-2_max
0  20      1      NaN
1  13      1      NaN
2  15      2      NaN
3  25      2      NaN
4  24      2      NaN
5  13      3     20.0
6  14      3     20.0
7  19      4     25.0
8  13      4     25.0
9  11      5     14.0
  • Related