I have a pandas DataFrame df. There's a column "a". I need to compute a column "b", which is a cumulative sum of "a" with an offset of 1 row.
So it's something like
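df["b"] = 0  # create the column; row 0 stays 0
for i in range(len(df) - 1):
    # each row is the previous "b" plus the previous "a"
    df.loc[i + 1, "b"] = df.loc[i, "b"] + df.loc[i, "a"]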
I am wondering if there's a built-in function that will allow me to do this without the for loop?
Here's an example with numbers:
df = {'a': [1, 2, 3, 4]}
After the above algorithm we should end up with
df = {'a': [1, 2, 3, 4], 'b': [0, 1, 3, 6]}
CodePudding user response:
You can use pandas.Series.cumsum with pandas.Series.shift:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3, 4]})
df["b"] = df["a"].cumsum().shift(periods=1).fillna(0).astype(int)
print(df)
# Output:
#    a  b
# 0  1  0
# 1  2  1
# 2  3  3
# 3  4  6
CodePudding user response:
IIUC, you just need:
df["b"] = df["a"].shift(fill_value=0).cumsum()
print(df)
# Output:
#    a  b
# 0  1  0
# 1  2  1
# 2  3  3
# 3  4  6
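For anyone who wants to check the two answers against the loop from the question, here is a minimal sketch. The variable names (expected, cumsum_then_shift, shift_then_cumsum) are made up for illustration, and it assumes the small integer example from the question with a default index.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4]})

# Reference result: the explicit loop described in the question.
expected = [0]
for value in df["a"].iloc[:-1].tolist():
    expected.append(expected[-1] + value)

# First answer: cumulative sum, then shift down one row and fill the gap.
cumsum_then_shift = df["a"].cumsum().shift(periods=1).fillna(0).astype(int)

# Second answer: shift down one row (filling with 0), then cumulative sum.
shift_then_cumsum = df["a"].shift(fill_value=0).cumsum()

print(expected)
print(cumsum_then_shift.tolist())
print(shift_then_cumsum.tolist())
```

On this data all three lists should match. One difference worth keeping in mind: the first answer ends with astype(int), so if "a" held floats the fractional part would be dropped, whereas shift(fill_value=0) never introduces NaN and so needs no cast.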