I have a dataframe with OHLC data. I need to get the close price into the pandas series, using the timestamp column as the index.
I am reading from a sqlite db into my df:
conn = sql.connect('allStockData.db')
price = pd.read_sql_query("SELECT * from ohlc_minutes", conn)
price['timestamp'] = pd.to_datetime(price['timestamp'])
print(price)
Which returns:
timestamp open high low close volume trade_count vwap symbol volume_10_day
0 2022-09-16 08:00:00 00:00 3.19 3.570 3.19 3.350 66475 458 3.404240 AAOI NaN
1 2022-09-16 08:05:00 00:00 3.35 3.440 3.33 3.430 28925 298 3.381131 AAOI NaN
2 2022-09-16 08:10:00 00:00 3.44 3.520 3.35 3.400 62901 643 3.445096 AAOI NaN
3 2022-09-16 08:15:00 00:00 3.37 3.390 3.31 3.360 17943 184 3.339721 AAOI NaN
4 2022-09-16 08:20:00 00:00 3.36 3.410 3.34 3.400 29123 204 3.383370 AAOI NaN
... ... ... ... ... ... ... ... ... ... ...
8759 2022-09-08 23:35:00 00:00 1.35 1.360 1.35 1.355 3835 10 1.350613 RUBY 515994.5
8760 2022-09-08 23:40:00 00:00 1.36 1.360 1.35 1.350 2780 7 1.353687 RUBY 515994.5
8761 2022-09-08 23:45:00 00:00 1.35 1.355 1.35 1.355 7080 11 1.350424 RUBY 515994.5
8762 2022-09-08 23:50:00 00:00 1.35 1.360 1.33 1.360 11664 30 1.351104 RUBY 515994.5
8763 2022-09-08 23:55:00 00:00 1.36 1.360 1.33 1.340 21394 32 1.348223 RUBY 515994.5
[8764 rows x 10 columns]
When I try to get the close into a series with the timestamp:
price = pd.Series(price['close'], index=price['timestamp'])
It returns a bunch of NaNs:
2022-09-16 08:00:00 00:00 NaN
2022-09-16 08:05:00 00:00 NaN
2022-09-16 08:10:00 00:00 NaN
2022-09-16 08:15:00 00:00 NaN
2022-09-16 08:20:00 00:00 NaN
..
2022-09-08 23:35:00 00:00 NaN
2022-09-08 23:40:00 00:00 NaN
2022-09-08 23:45:00 00:00 NaN
2022-09-08 23:50:00 00:00 NaN
2022-09-08 23:55:00 00:00 NaN
Name: close, Length: 8764, dtype: float64
If I remove the index:
price = pd.Series(price['close'])
The close is returned normally:
0 3.350
1 3.430
2 3.400
3 3.360
4 3.400
...
8759 1.355
8760 1.350
8761 1.355
8762 1.360
8763 1.340
Name: close, Length: 8764, dtype: float64
How can I return the close column as a pandas series, using my timestamp column as the index?
CodePudding user response:
It's because price['close']
has it's own index which is incompatible with timestamp
. Try use .values
instead:
price = pd.Series(price['close'].values, index=price['timestamp'])
CodePudding user response:
I needed to set the timestamp to the index before getting the the close as a series:
conn = sql.connect('allStockData.db')
price = pd.read_sql_query("SELECT * from ohlc_minutes", conn)
price['timestamp'] = pd.to_datetime(price['timestamp'])
price = price.set_index('timestamp')
print(price)
price = pd.Series(price['close'])
print(price)
Gives:
2022-09-16 08:00:00 00:00 3.350
2022-09-16 08:05:00 00:00 3.430
2022-09-16 08:10:00 00:00 3.400
2022-09-16 08:15:00 00:00 3.360
2022-09-16 08:20:00 00:00 3.400
...
2022-09-08 23:35:00 00:00 1.355
2022-09-08 23:40:00 00:00 1.350
2022-09-08 23:45:00 00:00 1.355
2022-09-08 23:50:00 00:00 1.360
2022-09-08 23:55:00 00:00 1.340
Name: close, Length: 8764, dtype: float64