I have following dataframe:
midPrice Change % Spike New Oilprice
92.20000 0.00 0 92.043405
92.26454 0.07 0 92.049689
91.96950 -0.32 0 91.979751
91.73958 -0.25 0 91.844369
91.78985 0.05 0 91.724690
91.41000 -0.41 0 91.568880
91.18148 -0.25 0 91.690812
91.24257 0.07 0 91.858391
90.95352 -0.32 0 92.016806
93.24000 2.51 1 92.139872
93.31013 0.08 0 92.321622
93.00690 -0.32 0 92.542687
92.77438 -0.25 0 92.727070
92.86400 0.10 0 92.949655
and whenever I have a Spike (1) in the column, I want to replace the 5 rows after the spike (including) with the new oil prices. The rest of the rows are being kept as they are.
Any ideas how to solve that? I tried the code based on following:
- Iterate through the df (for loop)
- If/else statement if spike == 1 then replace following 5 rows with values of new oil prices / else: keep oil prices
def spike(i):
for i in df['Spike']:
if i.loc == 1:
df['midPrice'].replace(df['New Oilprice'][i:5])`
It unfortunately doesn't work and I\m not so strong with pandas. I tried mapping the function as well on the dataframe which didn't work either. I would appreciate any help
CodePudding user response:
Assuming the df is sorted by time in ascending order (as I've seen in the edit history of your question that you had a time column), you could use a mask like so:
mask = df['Spike'].eq(1).where(df['Spike'].eq(1)).fillna(method='ffill', limit=4).fillna(False)
df.loc[mask, 'midPrice'] = df['New Oilprice']
print(df)
midPrice Change % Spike New Oilprice
0 92.200000 0.00 0 92.043405
1 92.264540 0.07 0 92.049689
2 91.969500 -0.32 0 91.979751
3 91.739580 -0.25 0 91.844369
4 91.789850 0.05 0 91.724690
5 91.410000 -0.41 0 91.568880
6 91.181480 -0.25 0 91.690812
7 91.242570 0.07 0 91.858391
8 90.953520 -0.32 0 92.016806
9 92.139872 2.51 1 92.139872
10 92.321622 0.08 0 92.321622
11 92.542687 -0.32 0 92.542687
12 92.727070 -0.25 0 92.727070
13 92.949655 0.10 0 92.949655
EDIT - 2 rows before, 3 rows after:
You can adjust the mask with another fillna
:
mask = df['Spike'].eq(1).where(df['Spike'].eq(1)).fillna(method='bfill', limit=2).fillna(method='ffill', limit=3).fillna(False)
df.loc[mask, 'midPrice'] = df['New Oilprice']
print(df)
midPrice Change % Spike New Oilprice
0 92.200000 0.00 0 92.043405
1 92.264540 0.07 0 92.049689
2 91.969500 -0.32 0 91.979751
3 91.739580 -0.25 0 91.844369
4 91.789850 0.05 0 91.724690
5 91.410000 -0.41 0 91.568880
6 91.181480 -0.25 0 91.690812
7 91.858391 0.07 0 91.858391
8 92.016806 -0.32 0 92.016806
9 92.139872 2.51 1 92.139872
10 92.321622 0.08 0 92.321622
11 92.542687 -0.32 0 92.542687
12 92.727070 -0.25 0 92.727070
13 92.949655 0.10 0 92.949655