Relative newbie with Python and pandas, finally admitting defeat on not being able to figure this out myself. I have a pandas DataFrame from our energy supplier's API. Each row is a 30-minute interval showing the wholesale energy cost in p/kWh ('value_exc_vat'), the solar output for the house ('export') and a datetime stamp ('datetime').
| index |'value_exc_vat'|'datetime'|'export'|'hour'|'export_rate'|'export_rate_var'|
'hour' is taken from datetime for each row e.g. 13, 14, 15, 16, etc.
To calculate the price/kWh we are paid, I need to calculate
0.97 x 'value_exc_vat' + peak_rate_uplift
peak_rate_uplift is only applied during the hours 16 to 19 inclusive.
I've tried just about every method I can think of but I can't get this to work.
peak_rate = [16, 17, 18, 19]
for hour in df['hour']:
    if hour == peak_rate:
        df['export_rate_var'] = (df['export_rate'] + peak_rate_uplift)
    else:
        df['export_rate_var'] = df['export_rate']
Printing the output from the if statement, I can see that 'hour' is being selected for the correct values, but the rest of the statement doesn't then add the peak_rate_uplift as I would expect.
Any advice or help on how to apply the addition to the selected rows would be appreciated. It feels like it should be something simple, but I've been at this for 3 days now...
CodePudding user response:
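Does this work?
peak_rate = [16, 17, 18, 19]
for i in range(len(df)):
    row = df.index[i]
    # a single hour is a plain number, so test membership with "in" rather than .isin()
    if df['hour'].iloc[i] in peak_rate:
        # assign to this row only, not to the whole column
        df.loc[row, 'export_rate_var'] = df.loc[row, 'export_rate'] + peak_rate_uplift
    else:
        df.loc[row, 'export_rate_var'] = df.loc[row, 'export_rate']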
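As a side note on why the loop in the question never adds the uplift, here is a minimal sketch (the variable names equals_list and in_list are only for illustration). Comparing one hour to the whole list with == is always False, so the else branch runs every time. On top of that, assigning to df['export_rate_var'] without a row selector overwrites the entire column on each pass.

```python
peak_rate = [16, 17, 18, 19]
hour = 16

# An int is never equal to a list, even if the list contains that int.
equals_list = (hour == peak_rate)

# A membership test is what the condition is meant to express.
in_list = (hour in peak_rate)

print(equals_list, in_list)
```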
CodePudding user response:
You could use:
peak_rate = [16,17,18,19]
df['export_rate_var'] = df['export_rate'] + df.hour.isin(peak_rate) * peak_rate_uplift
Here df.hour.isin(peak_rate) returns a boolean Series. Multiplying it by the numeric peak_rate_uplift gives a Series that equals peak_rate_uplift where the hour is in the peak-rate hours and 0 everywhere else.
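To see the boolean-mask approach end to end, here is a small self-contained sketch. The sample rows and the uplift of 5.0 p/kWh are made up for illustration, and it assumes 'export_rate' is 0.97 x 'value_exc_vat' as described in the question; your real frame and values will differ.

```python
import pandas as pd

# Hypothetical sample data; the real frame comes from the supplier's API.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2023-01-01 15:30", "2023-01-01 16:00",
        "2023-01-01 19:30", "2023-01-01 20:00",
    ]),
    "value_exc_vat": [10.0, 20.0, 30.0, 40.0],
})
peak_rate_uplift = 5.0  # assumed value, in p/kWh
peak_rate = [16, 17, 18, 19]

df["hour"] = df["datetime"].dt.hour
df["export_rate"] = 0.97 * df["value_exc_vat"]  # assumed definition

# isin() gives True/False per row; True * uplift is the uplift, False * uplift is 0.
is_peak = df["hour"].isin(peak_rate)
df["export_rate_var"] = df["export_rate"] + is_peak * peak_rate_uplift

print(df[["hour", "export_rate", "export_rate_var"]])
```

Only the 16:00 and 19:30 rows should get the uplift here. Because the whole column is computed in one expression, there is no loop and no risk of overwriting earlier rows.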