Vectorized version of sum of remaining rows

Time:10-19

In the small dataframe df I want to create a new variable 'y'.

'y' should be 'x' times the sum of the remaining rows of the variable n. So for the first row I want to use df.n.sum(), for the second row df.n.iloc[1:].sum(), and so on.

Can this be done vectorized?

import pandas as pd

df=pd.DataFrame({'n':[4,5,6,7,8,9],
                'x':[1,2,3,4,5,6]})

df['y'] = df.x * df.n.sum() #?

I can do this with a for loop and get the expected output.

Expected output:
output = [df.n.iloc[i:].sum() for i in range(len(df))]
print(output)

Output:
[39, 35, 30, 24, 17, 9]

Answer:

You want a reverse cumsum:

df['out'] = df.loc[::-1, 'n'].cumsum()

output:

   n  x  out
0  4  1   39
1  5  2   35
2  6  3   30
3  7  4   24
4  8  5   17
5  9  6    9
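
To get the 'y' from the question, the reverse cumsum can then be multiplied by 'x'. Here is a minimal sketch, assuming 'y' is meant to be x times the sum of the remaining rows of n (the name `remaining` is only for illustration and is not part of the original answer). Note that the assignment above works because pandas aligns the reversed result back to the original rows by index label; the sketch reverses a second time so the order is also correct by position.

```python
import pandas as pd

df = pd.DataFrame({'n': [4, 5, 6, 7, 8, 9],
                   'x': [1, 2, 3, 4, 5, 6]})

# Reverse the column, take the running total, then reverse back so the
# values are in the original row order (index 0..5).
remaining = df['n'][::-1].cumsum()[::-1]

# Assumption: 'y' is x multiplied by the sum of the remaining rows of n.
df['y'] = df['x'] * remaining

print(remaining.tolist())  # [39, 35, 30, 24, 17, 9]
print(df['y'].tolist())    # [39, 70, 90, 96, 85, 54]
```

The first printed list matches the expected output from the for loop in the question.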