I'm looking for a simplier way to create a new column that represents grouping a dataframe by ColA, and for each date in ColB, sum the previous two values in ColC. Example below:
Original DF
ColA ColB ColC
dog 10/13/2021 2
dog 10/14/2021 1
dog 10/15/2021 8
cat 10/12/2021 2
cat 10/13/2021 5
cat 10/14/2021 6
cat 10/15/2021 1
frog 10/12/2021 6
frog 10/13/2021 9
frog 10/14/2021 2
frog 10/15/2021 4
Desired DF
ColA ColB ColC NewCol
dog 10/13/2021 2 NaN
dog 10/14/2021 1 2
dog 10/15/2021 8 3
cat 10/12/2021 2 NaN
cat 10/13/2021 5 2
cat 10/14/2021 6 7
cat 10/15/2021 1 11
frog 10/12/2021 6 NaN
frog 10/13/2021 9 6
frog 10/14/2021 2 15
frog 10/15/2021 4 11
I know I could do this with looping through the ColA categories and making subset dataframes, etc, etc but would be several lines of code. Any one have a speedier option/idea? thanks.
CodePudding user response:
Use lambda function with rolling
and shift
like:
f = lambda x: x.rolling(2, min_periods=1).sum().shift()
df['new'] = df.groupby('ColA')['ColC'].apply(f).reset_index(level=0, drop=True)
print (df)
ColA ColB ColC new
0 dog 10/13/2021 2 NaN
1 dog 10/14/2021 1 2.0
2 dog 10/15/2021 8 3.0
3 cat 10/12/2021 2 NaN
4 cat 10/13/2021 5 2.0
5 cat 10/14/2021 6 7.0
6 cat 10/15/2021 1 11.0
7 frog 10/12/2021 6 NaN
8 frog 10/13/2021 9 6.0
9 frog 10/14/2021 2 15.0
10 frog 10/15/2021 4 11.0
Or with double groupby
like:
df['new'] = (df.groupby('ColA')['ColC'].rolling(2, min_periods=1).sum()
.groupby(level=0).shift()
.reset_index(level=0, drop=True))
print (df)
ColA ColB ColC new
0 dog 10/13/2021 2 NaN
1 dog 10/14/2021 1 2.0
2 dog 10/15/2021 8 3.0
3 cat 10/12/2021 2 NaN
4 cat 10/13/2021 5 2.0
5 cat 10/14/2021 6 7.0
6 cat 10/15/2021 1 11.0
7 frog 10/12/2021 6 NaN
8 frog 10/13/2021 9 6.0
9 frog 10/14/2021 2 15.0
10 frog 10/15/2021 4 11.0
CodePudding user response:
Try rolling with lazy groupby:
g = df.groupby('ColA')
df['ColD'] = (g['ColC'].rolling(2, min_period=1).sum()
.reset_index(level=0, drop=True)
)
df['ColD'] = g['ColD'].shift()
Output:
ColA ColB ColC ColD
0 dog 10/13/2021 2 NaN
1 dog 10/14/2021 1 2.0
2 dog 10/15/2021 8 3.0
3 cat 10/12/2021 2 NaN
4 cat 10/13/2021 5 2.0
5 cat 10/14/2021 6 7.0
6 cat 10/15/2021 1 11.0
7 frog 10/12/2021 6 NaN
8 frog 10/13/2021 9 6.0
9 frog 10/14/2021 2 15.0
10 frog 10/15/2021 4 11.0