I found this question but I couldn't understand the problem and I couldn't adjust it for my case:
Usage example:
let's assume that I want to get the value of competition
and market_name
from line 5 of my CSV and filter lines 1,2,3 and 4 that have the same values in these columns and sum the column back, if back is > 0
, then the new column called invest will show TRUE
, otherwise FALSE
. I want to do this by going through all the lines of the Dataframe, always calculating only the previous lines. The idea is to know whether or not it would be profitable to invest based on existing records in the history.
In order for the code to become malleable for different types of filters, I did it this way:
My 100% executable code for testing:
import pandas as pd
from functools import reduce
df = pd.read_csv('test.csv')
df = df[df['result'].notnull()].reset_index(drop=True)
options = [
'country',
'competition',
'market_name',
'runner_name',
'odds',
'total_matched_vls',
'minute_traded_vls'
]
def my_func(cb, i):
dfilter = df[:i]
filter = dfilter[reduce(lambda x, y : x & (df[y[0]] == y[1]), zip(cb, map(lambda x: df.iloc[i][x], cb)), True)]
back_sum = filter['back'].sum()
aaa = True
if back_sum <= 0:
aaa = False
return aaa
def combs(a):
if len(a) == 0:
return [[]]
cs = []
for c in combs(a[1:]):
cs = [c, c [a[0]]]
return cs
combinations = combs(options)
combinations = [x for x in combinations if x != []]
for cbnt in combinations:
df['invest'] = [my_func(cbnt,i) for i in range(len(df))]
true_backs = df[(df['invest'] == True)]['back']
if (true_backs.sum() > 0):
print(cbnt)
CSV sample example:
clock_now,country,competition,match_id,match_name,market_id,market_name,runner_id,runner_name,total_matched,total_matched_vls,one,minute_traded,minute_traded_vls,odds,result,back,lay
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
But i receive this Warning:
UserWarning: Boolean Series key will be reindexed to match DataFrame index.
filter = dfilter[reduce(lambda x, y : x & (df[y[0]] == y[1]), zip(cb, map(lambda x: df.iloc[i][x], cb)), True)]
If I need the Series to generate the malleable option, how should I proceed to not receive this warning?
CodePudding user response:
It looks like you want a groupby.cumsum
, and compare is to the current value:
m1 = df.groupby('country')['back'].cumsum().gt(df['back'])
m2 = df.groupby(['competition','market_name'])['back'].cumsum().gt(df['back'])
df['invest'] = m1&m2
For many conditions:
from functools import reduce
combinations = [['country'],['competition','market_name']]
df['invest'] = reduce(np.logical_and,
(df.groupby(g)['back'].cumsum().gt(df['back'])
for g in combinations)
)
Output (invest column only):
0 False
1 False
2 False
3 True
4 True
5 True
6 True
7 True
8 True
9 True
10 True
11 True
12 True
13 True
14 True
Name: back, dtype: bool