Cumulative sum of a pandas dataframe column without for loop?

Time:12-03

I have a pandas dataframe df.

There's a column "a". I need to compute a column "b" which is a cumulative sum of "a" with an offset of 1 row.

So it's something like

df["b"] = 0

for i in range(len(df) - 1):
    df.loc[i + 1, "b"] = df.loc[i, "b"] + df.loc[i, "a"]

I am wondering if there's a built-in function that will allow me to do this without the for loop?

Here's an example with numbers:

df = {'a': [1, 2, 3, 4]}

After the above algorithm we should end up with

df = {'a': [1, 2, 3, 4], 'b': [0, 1, 3, 6]}

CodePudding user response:

You can use pandas.Series.cumsum with pandas.Series.shift:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4]})

df["b"] = df["a"].cumsum().shift(periods=1).fillna(0).astype(int)

print(df)

# Output:

   a  b
0  1  0
1  2  1
2  3  3
3  4  6
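
To see why the fillna(0).astype(int) tail is there, here is a minimal sketch using the same example frame. The intermediate names shifted and b are only for illustration and are not part of the answer above. Shifting an integer Series leaves a NaN in the first row, which upcasts the result to float, so the answer fills that gap and casts back.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4]})

# Running total first, then push every value down one row.
shifted = df["a"].cumsum().shift(periods=1)

# The first row has no predecessor, so it becomes NaN and the
# integer column is upcast to float64.
print(shifted.dtype)

# Fill the gap with 0 and cast back to integers.
b = shifted.fillna(0).astype(int)
print(b.tolist())
```

If "a" already holds floats or contains missing values, the astype(int) step may not be what you want, so treat it as specific to this integer example.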

CodePudding user response:

IIUC, you just need:

df["b"] = df["a"].shift(fill_value=0).cumsum()

print(df)

   a  b
0  1  0
1  2  1
2  3  3
3  4  6
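
As a quick sanity check, the sketch below compares the shift-then-cumsum one-liner against a plain Python version of the loop from the question. The names expected and alt are illustrative only. The last line is an alternative not mentioned in either answer (subtracting the column from its own running total), included just to show it seems to give the same result on this data.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4]})

# Reference result: the loop from the question, written with a plain list.
expected = [0]
for value in df["a"].iloc[:-1]:
    expected.append(expected[-1] + value)

# Shift first with fill_value=0, so no NaN appears and the dtype stays integer.
df["b"] = df["a"].shift(fill_value=0).cumsum()

# Alternative (an assumption, not from the answers): the running total
# minus the current row also excludes the current value.
alt = df["a"].cumsum() - df["a"]

print(df["b"].tolist() == expected)
print(alt.tolist() == expected)
```

Because fill_value=0 avoids the NaN in the first row, this version needs no fillna or astype afterwards.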