Home > OS >  Compute cumulative sum of negative and positive sequences
Compute cumulative sum of negative and positive sequences

Time:10-22

Here is my dataframe:

import pandas as pd
my_df = pd.DataFrame({'col_1': [1,2,5,6,4,3,7,8,9,11,14]})
my_df['diff'] = my_df['col_1'].diff()
my_df['cond_sum'] = [None, 1,4,5,-2,-3,4,5,6,8,11]

My original column is col_1. My goal is to obtain column cond_sum. diff column is difference between consecutive rows. cond_sum is cumulative sum of diff column which resets every time there is change in sign.

I can clearly write a function and do it on row by row basis, but I would like something faster.

CodePudding user response:

Get the sign of your diff, then create a groupby statement and calculate the cumsum

import numpy as np
import pandas as pd
my_df = pd.DataFrame({'col_1': [1,2,5,6,4,3,7,8,9,11,14]})
my_df['diff'] = my_df['col_1'].diff()

sign = np.sign(my_df['diff'])
groupby_ = (sign != sign.shift(1)).cumsum()
my_df['cond_sum'] = my_df.groupby(groupby_)['diff'].cumsum()
  • Related