everyone.
I don't know how to solve this.
I would like to sum the percentages according to the result.
Let me explain; all the % values for example with result i -1 should be summed (leaving the sign unchanged, so -1578 -0.987, etc., then when the sign changes, the value goes to zero and it starts summing those results with value 1.
Eventually in the column I would have something like this:
Date | Sign | % | res | sum |
---|---|---|---|---|
2020-10-06 | 1.0 | -1.594774 | -1 | -1.594774 |
2020-10-07 | 1.0 | 1.847843 | 1 | |
2020-10-08 | 1.0 | 0.493660 | 1 | |
2020-10-09 | 1.0 | 1.372718 | 1 | |
2020-10-12 | 1.0 | 2.495056 | 1 | 6.209277 |
2020-10-13 | 1.0 | -0.104176 | -1 | |
2020-10-14 | 1.0 | -0.808668 | -1 | |
2020-10-15 | 1.0 | -0.468337 | -1 | |
2020-10-16 | 1.0 | -0.362510 | -1 | |
2020-10-19 | -1.0 | -1.678558 | -1 | -3,422249 |
2020-10-20 | -1.0 | 0.326578 | 1 | |
2020-10-21 | -1.0 | -0.276889 | -1 | |
2020-10-22 | -1.0 | 0.185289 | 1 | |
2020-10-23 | -1.0 | 0.366033 | 1 | |
2020-10-26 | -1.0 | -1.666879 | -1 |
I tried this code but the result is wrong. I can't understand why it sums incorrectly.
summ = 0
fin = []
for sm in df1['res']:
if sm == 1:
summ = df1['%']
fin.append(summ)
elif sm == -1:
summ = df1['%']
fin.append(summ)
fin
[Date
2020-10-06 -754.327977
2020-10-07 874.029912
2020-10-08 233.501375
2020-10-09 649.295754
2020-10-12 1180.161504
...
2022-08-16 -92.054601
2022-08-17 -601.132479
2022-08-18 99.302686
2022-08-19 -968.395810
2022-08-22 -1236.403534
Name: %, Length: 473, dtype: float64,
Date
2020-10-06 -754.327977
2020-10-07 874.029912
2020-10-08 233.501375
2020-10-09 649.295754
2020-10-12 1180.161504
I would also like to add fin in the df1.
I tried df1['fin'] = fin
, but it gives me no result.
Any ideas, thanks
CodePudding user response:
Assume that you want to keep a subtotal of all consecutive rows that have the same res value, you need to keep track of the current res value and change your code accordingly:
summ = 0
fin = []
current_sm = df1['res'][0] # this is so the code work with first index
for index, sm in enumerate(df1['res']):
if sm == current_sm: # same res value as previous, keep adding to the sum
summ = df1['%'][index]
else: # res change, save the subtotal, setup for the next
fin.append(summ)
current_sm = sm
summ = df1['%'][index]
fin.append(summ) # Collect subtotal for the last batch
CodePudding user response:
import pandas as pd
import numpy as np
df = pd.DataFrame({'%':[-1.594774,1.847843,0.493660,1.372718,2.495056,-0.104176,-0.808668,-0.468337,-0.362510,-1.678558,0.326578,-0.276889,0.185289,0.366033,-1.666879],
'res':[-1,1,1,1,1,-1,-1,-1,-1,-1,1,-1,1,1,-1]})
Initialize sum column:
df['sum'] = np.nan
First figure out the index where the sign change is happening:
l = df.index[df['res']!=df['res'].shift(-1)]
The indices:
Int64Index([0, 4, 9, 10, 11, 13, 14], dtype='int64')
Then loop through this list and calculate the sum of sub-sequence:
df.loc[l[0],'sum'] = df.loc[0:l[0],'%'].sum()
for i in range(1,len(l)):
df.loc[l[i],'sum'] = df.loc[l[i-1] 1:l[i],'%'].sum()
Output:
% res sum
0 -1.594774 -1 -1.594774
1 1.847843 1 NaN
2 0.493660 1 NaN
3 1.372718 1 NaN
4 2.495056 1 6.209277
5 -0.104176 -1 NaN
6 -0.808668 -1 NaN
7 -0.468337 -1 NaN
8 -0.362510 -1 NaN
9 -1.678558 -1 -3.422249
10 0.326578 1 0.326578
11 -0.276889 -1 -0.276889
12 0.185289 1 NaN
13 0.366033 1 0.551322
14 -1.666879 -1 -1.666879