Is there a way to get the original column back from a column that is the cumsum() of the original?
For example:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Original': [1, 0, 0, 1, 0, 5, 0, np.nan, np.nan, 4, 0, 0],
                   'CumSum': [1, 1, 1, 2, 2, 7, 7, np.nan, np.nan, 11, 11, 11]})
In the example df above, is it possible to get the Original column using only the CumSum column?
In my real dataset, I have a column similar to CumSum and I want to recover the original values. I looked for a built-in function that does this but haven't found one.
Answer:
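You can take the row-to-row difference of the forward-filled CumSum column, then restore the NaN rows and fill in the first value (which diff() leaves as NaN) from CumSum itself: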
df['Original2'] = (df['CumSum'].ffill().diff()
                   .mask(df['CumSum'].isna())
                   .fillna(df['CumSum'])
                  )
Output:
    Original  CumSum  Original2
0        1.0     1.0        1.0
1        0.0     1.0        0.0
2        0.0     1.0        0.0
3        1.0     2.0        1.0
4        0.0     2.0        0.0
5        5.0     7.0        5.0
6        0.0     7.0        0.0
7        NaN     NaN        NaN
8        NaN     NaN        NaN
9        4.0    11.0        4.0
10       0.0    11.0        0.0
11       0.0    11.0        0.0
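If you need this in more than one place, the same chain can be wrapped in a small helper and checked with a round trip. This is only a sketch: the name undo_cumsum is made up for illustration, and it assumes the cumulative column was produced by something like Series.cumsum() with its default skipna=True, so that NaN rows are skipped rather than propagated.

```python
import numpy as np
import pandas as pd


def undo_cumsum(cumsum: pd.Series) -> pd.Series:
    """Hypothetical helper: recover per-row values from a NaN-skipping cumulative sum."""
    # Carry the last known running total across NaN gaps, then take
    # the difference between consecutive rows.
    steps = cumsum.ffill().diff()
    # Rows that were NaN in the cumulative column stay NaN in the result.
    steps = steps.mask(cumsum.isna())
    # diff() has no predecessor for the first valid row, so its value
    # is the running total itself.
    return steps.fillna(cumsum)


# Round trip on the data from the question.
original = pd.Series([1, 0, 0, 1, 0, 5, 0, np.nan, np.nan, 4, 0, 0])
recovered = undo_cumsum(original.cumsum())
print(recovered.equals(original))
```

Series.equals treats NaNs in the same positions as equal, which is why it is used for the comparison instead of ==.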