Sum percentages by condition-CodePudding

everyone. I don't know how to solve this. I would like to sum the percentages according to the result. Let me explain; all the % values for example with result i -1 should be summed (leaving the sign unchanged, so -1578 -0.987, etc., then when the sign changes, the value goes to zero and it starts summing those results with value 1.
Eventually in the column I would have something like this:

Date	Sign	%	res	sum
2020-10-06	1.0	-1.594774	-1	-1.594774
2020-10-07	1.0	1.847843	1
2020-10-08	1.0	0.493660	1
2020-10-09	1.0	1.372718	1
2020-10-12	1.0	2.495056	1	6.209277
2020-10-13	1.0	-0.104176	-1
2020-10-14	1.0	-0.808668	-1
2020-10-15	1.0	-0.468337	-1
2020-10-16	1.0	-0.362510	-1
2020-10-19	-1.0	-1.678558	-1	-3,422249
2020-10-20	-1.0	0.326578	1
2020-10-21	-1.0	-0.276889	-1
2020-10-22	-1.0	0.185289	1
2020-10-23	-1.0	0.366033	1
2020-10-26	-1.0	-1.666879	-1

I tried this code but the result is wrong. I can't understand why it sums incorrectly.

summ = 0
fin = []
for sm in df1['res']:
    if sm == 1:
        summ  = df1['%'] 
        fin.append(summ)
    elif sm == -1:
        summ  = df1['%'] 
        fin.append(summ)

fin

[Date
 2020-10-06    -754.327977
 2020-10-07     874.029912
 2020-10-08     233.501375
 2020-10-09     649.295754
 2020-10-12    1180.161504
                  ...     
 2022-08-16     -92.054601
 2022-08-17    -601.132479
 2022-08-18      99.302686
 2022-08-19    -968.395810
 2022-08-22   -1236.403534
 Name: %, Length: 473, dtype: float64,
 Date
 2020-10-06    -754.327977
 2020-10-07     874.029912
 2020-10-08     233.501375
 2020-10-09     649.295754
 2020-10-12    1180.161504

I would also like to add fin in the df1. I tried df1['fin'] = fin, but it gives me no result. Any ideas, thanks

CodePudding user response：

Assume that you want to keep a subtotal of all consecutive rows that have the same res value, you need to keep track of the current res value and change your code accordingly:

summ = 0
fin = []

current_sm = df1['res'][0] # this is so the code work with first index
for index, sm in enumerate(df1['res']):
    if sm == current_sm: # same res value as previous, keep adding to the sum
        summ  = df1['%'][index]
    else: # res change, save the subtotal, setup for the next
       fin.append(summ)
       current_sm = sm
       summ = df1['%'][index]
fin.append(summ) # Collect subtotal for the last batch

CodePudding user response：

import pandas as pd
import numpy as np
df = pd.DataFrame({'%':[-1.594774,1.847843,0.493660,1.372718,2.495056,-0.104176,-0.808668,-0.468337,-0.362510,-1.678558,0.326578,-0.276889,0.185289,0.366033,-1.666879],
                   'res':[-1,1,1,1,1,-1,-1,-1,-1,-1,1,-1,1,1,-1]})

Initialize sum column:

df['sum'] = np.nan

First figure out the index where the sign change is happening:

l = df.index[df['res']!=df['res'].shift(-1)]

The indices:

Int64Index([0, 4, 9, 10, 11, 13, 14], dtype='int64')

Then loop through this list and calculate the sum of sub-sequence:

df.loc[l[0],'sum'] = df.loc[0:l[0],'%'].sum()

for i in range(1,len(l)):
    df.loc[l[i],'sum'] = df.loc[l[i-1] 1:l[i],'%'].sum()

Output:

           %  res       sum
0  -1.594774   -1 -1.594774
1   1.847843    1       NaN
2   0.493660    1       NaN
3   1.372718    1       NaN
4   2.495056    1  6.209277
5  -0.104176   -1       NaN
6  -0.808668   -1       NaN
7  -0.468337   -1       NaN
8  -0.362510   -1       NaN
9  -1.678558   -1 -3.422249
10  0.326578    1  0.326578
11 -0.276889   -1 -0.276889
12  0.185289    1       NaN
13  0.366033    1  0.551322
14 -1.666879   -1 -1.666879