Home > Blockchain >  Replace column with some rows of another column
Replace column with some rows of another column

Time:05-18

I have following dataframe:

midPrice    Change %  Spike  New Oilprice
92.20000    0.00      0      92.043405 
92.26454    0.07      0      92.049689
91.96950    -0.32     0      91.979751
91.73958    -0.25     0      91.844369
91.78985    0.05      0      91.724690
91.41000    -0.41     0      91.568880
91.18148    -0.25     0      91.690812
91.24257    0.07      0      91.858391
90.95352    -0.32     0      92.016806 
93.24000    2.51      1      92.139872
93.31013    0.08      0      92.321622
93.00690    -0.32     0      92.542687
92.77438    -0.25     0      92.727070
92.86400    0.10      0      92.949655

and whenever I have a Spike (1) in the column, I want to replace the 5 rows after the spike (including) with the new oil prices. The rest of the rows are being kept as they are.

Any ideas how to solve that? I tried the code based on following:

  1. Iterate through the df (for loop)
  2. If/else statement if spike == 1 then replace following 5 rows with values of new oil prices / else: keep oil prices
def spike(i):   
 
    for i in df['Spike']: 
         if i.loc == 1: 
              df['midPrice'].replace(df['New Oilprice'][i:5])`

It unfortunately doesn't work and I\m not so strong with pandas. I tried mapping the function as well on the dataframe which didn't work either. I would appreciate any help

CodePudding user response:

Assuming the df is sorted by time in ascending order (as I've seen in the edit history of your question that you had a time column), you could use a mask like so:

mask = df['Spike'].eq(1).where(df['Spike'].eq(1)).fillna(method='ffill', limit=4).fillna(False)
df.loc[mask, 'midPrice'] = df['New Oilprice']

print(df)
     midPrice  Change %  Spike  New Oilprice
0   92.200000      0.00      0     92.043405
1   92.264540      0.07      0     92.049689
2   91.969500     -0.32      0     91.979751
3   91.739580     -0.25      0     91.844369
4   91.789850      0.05      0     91.724690
5   91.410000     -0.41      0     91.568880
6   91.181480     -0.25      0     91.690812
7   91.242570      0.07      0     91.858391
8   90.953520     -0.32      0     92.016806
9   92.139872      2.51      1     92.139872
10  92.321622      0.08      0     92.321622
11  92.542687     -0.32      0     92.542687
12  92.727070     -0.25      0     92.727070
13  92.949655      0.10      0     92.949655

EDIT - 2 rows before, 3 rows after:

You can adjust the mask with another fillna:


mask = df['Spike'].eq(1).where(df['Spike'].eq(1)).fillna(method='bfill', limit=2).fillna(method='ffill', limit=3).fillna(False)
df.loc[mask, 'midPrice'] = df['New Oilprice']
​
print(df)
     midPrice  Change %  Spike  New Oilprice
0   92.200000      0.00      0     92.043405
1   92.264540      0.07      0     92.049689
2   91.969500     -0.32      0     91.979751
3   91.739580     -0.25      0     91.844369
4   91.789850      0.05      0     91.724690
5   91.410000     -0.41      0     91.568880
6   91.181480     -0.25      0     91.690812
7   91.858391      0.07      0     91.858391
8   92.016806     -0.32      0     92.016806
9   92.139872      2.51      1     92.139872
10  92.321622      0.08      0     92.321622
11  92.542687     -0.32      0     92.542687
12  92.727070     -0.25      0     92.727070
13  92.949655      0.10      0     92.949655
  • Related