Home > Blockchain >  Sum percentages by condition
Sum percentages by condition

Time:08-25

everyone. I don't know how to solve this. I would like to sum the percentages according to the result. Let me explain; all the % values for example with result i -1 should be summed (leaving the sign unchanged, so -1578 -0.987, etc., then when the sign changes, the value goes to zero and it starts summing those results with value 1.
Eventually in the column I would have something like this:

Date Sign % res sum
2020-10-06 1.0 -1.594774 -1 -1.594774
2020-10-07 1.0 1.847843 1
2020-10-08 1.0 0.493660 1
2020-10-09 1.0 1.372718 1
2020-10-12 1.0 2.495056 1 6.209277
2020-10-13 1.0 -0.104176 -1
2020-10-14 1.0 -0.808668 -1
2020-10-15 1.0 -0.468337 -1
2020-10-16 1.0 -0.362510 -1
2020-10-19 -1.0 -1.678558 -1 -3,422249
2020-10-20 -1.0 0.326578 1
2020-10-21 -1.0 -0.276889 -1
2020-10-22 -1.0 0.185289 1
2020-10-23 -1.0 0.366033 1
2020-10-26 -1.0 -1.666879 -1

I tried this code but the result is wrong. I can't understand why it sums incorrectly.

summ = 0
fin = []
for sm in df1['res']:
    if sm == 1:
        summ  = df1['%'] 
        fin.append(summ)
    elif sm == -1:
        summ  = df1['%'] 
        fin.append(summ)

fin   

[Date
 2020-10-06    -754.327977
 2020-10-07     874.029912
 2020-10-08     233.501375
 2020-10-09     649.295754
 2020-10-12    1180.161504
                  ...     
 2022-08-16     -92.054601
 2022-08-17    -601.132479
 2022-08-18      99.302686
 2022-08-19    -968.395810
 2022-08-22   -1236.403534
 Name: %, Length: 473, dtype: float64,
 Date
 2020-10-06    -754.327977
 2020-10-07     874.029912
 2020-10-08     233.501375
 2020-10-09     649.295754
 2020-10-12    1180.161504

I would also like to add fin in the df1. I tried df1['fin'] = fin, but it gives me no result. Any ideas, thanks

CodePudding user response:

Assume that you want to keep a subtotal of all consecutive rows that have the same res value, you need to keep track of the current res value and change your code accordingly:

summ = 0
fin = []

current_sm = df1['res'][0] # this is so the code work with first index
for index, sm in enumerate(df1['res']):
    if sm == current_sm: # same res value as previous, keep adding to the sum
        summ  = df1['%'][index]
    else: # res change, save the subtotal, setup for the next
       fin.append(summ)
       current_sm = sm
       summ = df1['%'][index]
fin.append(summ) # Collect subtotal for the last batch

CodePudding user response:

import pandas as pd
import numpy as np
df = pd.DataFrame({'%':[-1.594774,1.847843,0.493660,1.372718,2.495056,-0.104176,-0.808668,-0.468337,-0.362510,-1.678558,0.326578,-0.276889,0.185289,0.366033,-1.666879],
                   'res':[-1,1,1,1,1,-1,-1,-1,-1,-1,1,-1,1,1,-1]})

Initialize sum column:

df['sum'] = np.nan

First figure out the index where the sign change is happening:

l = df.index[df['res']!=df['res'].shift(-1)]

The indices:

Int64Index([0, 4, 9, 10, 11, 13, 14], dtype='int64')

Then loop through this list and calculate the sum of sub-sequence:

df.loc[l[0],'sum'] = df.loc[0:l[0],'%'].sum()

for i in range(1,len(l)):
    df.loc[l[i],'sum'] = df.loc[l[i-1] 1:l[i],'%'].sum()

Output:

           %  res       sum
0  -1.594774   -1 -1.594774
1   1.847843    1       NaN
2   0.493660    1       NaN
3   1.372718    1       NaN
4   2.495056    1  6.209277
5  -0.104176   -1       NaN
6  -0.808668   -1       NaN
7  -0.468337   -1       NaN
8  -0.362510   -1       NaN
9  -1.678558   -1 -3.422249
10  0.326578    1  0.326578
11 -0.276889   -1 -0.276889
12  0.185289    1       NaN
13  0.366033    1  0.551322
14 -1.666879   -1 -1.666879
  • Related