I have a bunch of daily ohlc data like so:
date,volume,open,close,high,low
2022-05-02,1756159.0,118.38,119.57,120.34,116.49
2022-05-03,3217838.0,119.72,122.4,123.98,119.09
2022-05-04,2460350.0,121.69,126.3,126.69,121.44
2022-05-05,2123645.0,124.62,122.15,125.21,120.8
2022-05-06,1629034.0,120.88,121.08,121.88,118.0
2022-05-09,1861704.0,119.13,113.11,119.13,112.64
2022-05-10,2141753.0,115.44,116.64,117.94,113.14
2022-05-11,1607013.0,115.7,113.99,118.0,113.84
2022-05-12,1338023.0,113.61,116.13,116.25,112.78
2022-05-13,1328411.0,117.38,119.38,120.715,117.27
I want to return a new dataframe which contains a new column called previous close, which will be set to the close of the previous day.
I've tried:
final_df.loc[0, 'previous close'] = final_df.loc[0, 'c']
for i in range(1, len(df)):
final_df.loc[i, 'previouse close'] = df.loc[i-1, 'c']
And using np.where()
but I've had no luck. Here's an example of how I need it to look:
|------------|-----------|--------|--------|---------|--------|----------------|
| date | volume | open | close | high | low | previous close |
|------------|-----------|--------|--------|---------|--------|----------------|
| 2022-05-02 | 1756159.0 | 118.38 | 119.57 | 120.34 | 116.49 | |
|------------|-----------|--------|--------|---------|--------|----------------|
| 2022-05-03 | 3217838.0 | 119.72 | 122.4 | 123.98 | 119.09 | 119.57 |
|------------|-----------|--------|--------|---------|--------|----------------|
| 2022-05-04 | 2460350.0 | 121.69 | 126.3 | 126.69 | 121.44 | 122.4 |
|------------|-----------|--------|--------|---------|--------|----------------|
| 2022-05-05 | 2123645.0 | 124.62 | 122.15 | 125.21 | 120.8 | 126.3 |
|------------|-----------|--------|--------|---------|--------|----------------|
| 2022-05-06 | 1629034.0 | 120.88 | 121.08 | 121.88 | 118.0 | 122.15 |
|------------|-----------|--------|--------|---------|--------|----------------|
CodePudding user response:
You can use Series.shift
df['previous close'] = df['close'].shift()
If you want to fill the row only when date
difference is exactly 1 day
df['date'] = pd.to_datetime(df['date'])
df['previous close'] = df['close'].shift().mask(df['date'].diff().ne(pd.Timedelta(days=1)), pd.NA)
print(df)
date volume open close high low previous close
0 2022-05-02 1756159.0 118.38 119.57 120.340 116.49 NaN
1 2022-05-03 3217838.0 119.72 122.40 123.980 119.09 119.57
2 2022-05-04 2460350.0 121.69 126.30 126.690 121.44 122.40
3 2022-05-05 2123645.0 124.62 122.15 125.210 120.80 126.30
4 2022-05-06 1629034.0 120.88 121.08 121.880 118.00 122.15
5 2022-05-09 1861704.0 119.13 113.11 119.130 112.64 NaN
6 2022-05-10 2141753.0 115.44 116.64 117.940 113.14 113.11
7 2022-05-11 1607013.0 115.70 113.99 118.000 113.84 116.64
8 2022-05-12 1338023.0 113.61 116.13 116.250 112.78 113.99
9 2022-05-13 1328411.0 117.38 119.38 120.715 117.27 116.13