Pandas multicolumn cumsum() on a subset of rows

Time:05-12

I can get this to work if I do the cumsum one column at a time. However, doing both columns together in a single assignment outputs NaNs.

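import pandas as pd

df = pd.DataFrame({"category": [1,2,2], "value1": [10,11,12], "value2": [20,21,22]})
m = df["category"] == 2  # boolean mask: rows to accumulate over
# Assigning both new columns at once through .loc is what produces the NaNs
df.loc[m, ["cum_value1", "cum_value2"]] = df.loc[m, ["value1", "value2"]].cumsum()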

CodePudding user response:

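You can just assign the new columns directly with df[[...]] instead of going through .loc. The result is matched up on the index, so rows outside the mask are left as NaN: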

df[["cum_value1", "cum_value2"]] = df.loc[m, ["value1", "value2"]].cumsum()
df
Out[520]: 
   category  value1  value2  cum_value1  cum_value2
0         1      10      20         NaN         NaN
1         2      11      21        11.0        21.0
2         2      12      22        23.0        43.0
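
As for why the .loc version gives NaNs: most likely pandas aligns the right-hand DataFrame on column labels, and "value1"/"value2" do not match "cum_value1"/"cum_value2", so nothing lines up. If you would rather make the label matching explicit, here is a minimal sketch that renames the cumsum columns first and then joins on the index. The intermediate name cums is just for illustration, and this is one possible alternative, not the only way to do it.

```python
import pandas as pd

df = pd.DataFrame({"category": [1, 2, 2], "value1": [10, 11, 12], "value2": [20, 21, 22]})
m = df["category"] == 2  # rows to accumulate over

# Cumulative sum over the masked rows only, with columns renamed
# to cum_value1 / cum_value2 so the labels are explicit.
cums = df.loc[m, ["value1", "value2"]].cumsum().add_prefix("cum_")

# join() is a left join on the index: rows outside the mask get NaN.
df = df.join(cums)
print(df)
```

This should give the same table as above, with NaN in row 0 and the running totals in rows 1 and 2.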