In the small dataframe df I want to create a new variable 'y'.
'y' should be 'x' * the sum of the remaining rows of the variable 'n'. So for the first row I want to use df.n.sum(), for the second row df.n.iloc[1:].sum(), and so on.
Can this be done in a vectorized way?
import pandas as pd
df = pd.DataFrame({'n': [4, 5, 6, 7, 8, 9],
                   'x': [1, 2, 3, 4, 5, 6]})
df['y'] = df.x * df.n.sum()  # ? this uses the full sum for every row
I can compute the remaining sums with a loop, which gives the expected output:
output = [df.n.iloc[i:].sum() for i in range(len(df))]
print(output)
Output:
[39, 35, 30, 24, 17, 9]
Answer:
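You want a reverse cumsum: reverse the rows, take the cumulative sum, and let pandas align the result back to the original rows by index when it is assigned: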
df['out'] = df.loc[::-1, 'n'].cumsum()
output:
   n  x  out
0  4  1   39
1  5  2   35
2  6  3   30
3  7  4   24
4  8  5   17
5  9  6    9
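To get from 'out' to the 'y' column the question asks for, here is a minimal sketch. It assumes 'y' is meant to be 'x' multiplied by the remaining sum (the operator is not visible in the question, so that is a guess), and the name `remaining` is just illustrative. It also reverses the result back explicitly, so the order is correct positionally and does not depend on index alignment:

```python
import pandas as pd

df = pd.DataFrame({'n': [4, 5, 6, 7, 8, 9],
                   'x': [1, 2, 3, 4, 5, 6]})

# Reverse the column, accumulate, then reverse back so that row i
# holds the sum of n from row i to the end.
remaining = df['n'][::-1].cumsum()[::-1]

# Sanity check against the list comprehension from the question.
expected = [df.n.iloc[i:].sum() for i in range(len(df))]
assert remaining.tolist() == expected

# Assumption: 'y' is 'x' times the remaining sum.
df['y'] = df['x'] * remaining
print(df)
```

If you only need the remaining sums, `remaining` on its own is the same as the 'out' column above.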