Home > Mobile >  Cumulative Deviation of 2 Columns in Pandas DF
Cumulative Deviation of 2 Columns in Pandas DF

Time:01-14

I have a rather simple request and have not found a suitable solution online. I have a DF that looks like this below and I need to find the cumulative deviation as shown in a new column to the DF. My DF looks like this:

    year    month   Curr Yr LT Avg
0   2022    1   667590.5985 594474.2003
1   2022    2   701655.5967 585753.1173
2   2022    3   667260.5368 575550.6112
3   2022    4   795338.8914 562312.5309
4   2022    5   516510.1103 501330.4306
5   2022    6   465717.9192 418087.1358
6   2022    7   366100.4456 344854.2453
7   2022    8   355089.157  351539.9371
8   2022    9   468479.4396 496831.2979
9   2022    10  569234.4156 570767.1723
10  2022    11  719505.8569 594368.6991
11  2022    12  670304.78   576495.7539

And, I need the cumulative deviation new column in this DF to look like this:

Cum Dev
0.122993392
0.160154637
0.159888559
0.221628609
0.187604073
0.178089327
0.16687643
0.152866293
0.129326033
0.114260993
0.124487107
0.128058305

In Excel, the calculation would look like this with data in Excel columns Z3:Z14, AA3:AA14 for the first row: =SUM(Z$3:Z3)/SUM(AA$3:AA3)-1 and for the next row: =SUM(Z$3:Z4)/SUM(AA$3:AA4)-1 and for the next as follows with the last row looking like this in the Excel example: =SUM(Z$3:Z14)/SUM(AA$3:AA14)-1

Thank you kindly for your help,

CodePudding user response:

You can divide the cumulative sums of those 2 columns element-wise, and then subtract 1 at the end:

>>> (df["Curr Yr"].cumsum() / df["LT Avg"].cumsum()) - 1

0     0.122993
1     0.160155
2     0.159889
3     0.221629
4     0.187604
5     0.178089
6     0.166876
7     0.152866
8     0.129326
9     0.114261
10    0.124487
11    0.128058
dtype: float64
  • Related