this data comes from stocks API, but unfortunately the date interval is not consistent i want to combine rows based on a given interval for example 1 hour or 15 minutes, to easily plot points on chart.
for example consider this dataframe, notice the date/time is not consistent, if i want to combine rows based on 30min interval.
date open high low close volume
0 2022-08-18 15:20:00 48.80 48.80 48.80 48.80 100
1 2022-08-18 15:18:00 48.80 48.80 48.80 48.80 2000
2 2022-08-18 15:16:00 48.80 48.80 48.80 48.80 4221
3 2022-08-18 15:11:00 48.75 48.75 48.75 48.75 275
4 2022-08-18 15:00:00 48.80 48.80 48.80 48.80 2
5 2022-08-18 14:59:00 48.75 48.80 48.75 48.80 1000
6 2022-08-18 14:58:00 48.70 48.70 48.65 48.65 500
7 2022-08-18 14:57:00 48.70 48.70 48.70 48.70 1060
8 2022-08-18 14:56:00 48.65 48.65 48.65 48.65 207
9 2022-08-18 14:55:00 48.70 48.70 48.70 48.70 118
10 2022-08-18 14:53:00 48.70 48.70 48.70 48.70 60
11 2022-08-18 14:51:00 48.55 48.65 48.55 48.65 1937
12 2022-08-18 14:50:00 48.65 48.65 48.65 48.65 120
13 2022-08-18 14:48:00 48.60 48.60 48.55 48.55 880
14 2022-08-18 14:46:00 48.60 48.60 48.60 48.60 270
15 2022-08-18 14:44:00 48.60 48.60 48.60 48.60 300
16 2022-08-18 14:39:00 48.60 48.60 48.60 48.60 40
17 2022-08-18 14:38:00 48.60 48.60 48.60 48.60 30
18 2022-08-18 14:33:00 48.65 48.65 48.65 48.65 297
19 2022-08-18 14:30:00 48.65 48.65 48.65 48.65 345
20 2022-08-18 14:29:00 48.50 48.55 48.45 48.55 1175
21 2022-08-18 14:26:00 48.60 48.60 48.45 48.45 2256
22 2022-08-18 14:25:00 48.60 48.60 48.60 48.60 325
23 2022-08-18 14:20:00 48.60 48.70 48.60 48.70 1834
24 2022-08-18 14:19:00 48.65 48.65 48.55 48.60 3545
25 2022-08-18 14:18:00 48.50 48.65 48.50 48.65 848
26 2022-08-18 14:17:00 48.50 48.55 48.50 48.55 2160
27 2022-08-18 14:16:00 48.65 48.65 48.60 48.60 254
rows 19 should be combine all the values from 19 to 27, low would be the lowest value in these rows and so on.
I need a flexible solution that will work with any interval.
please let me know if my question is not clear, thank you
CodePudding user response:
Use pd.Grouper
. Analyzing stock prices with pandas is also common enough that the library includes an Open-High-Low-Close (ohlc
) function to deal with it.
df.groupby(pd.Grouper(key="date", freq="15T"))["close"].ohlc()
15T
means 15 minutes. You can find other time units in Offset aliases.