Home > Enterprise >  merge rows based on a specific date interval
merge rows based on a specific date interval

Time:08-20

this data comes from stocks API, but unfortunately the date interval is not consistent i want to combine rows based on a given interval for example 1 hour or 15 minutes, to easily plot points on chart.

for example consider this dataframe, notice the date/time is not consistent, if i want to combine rows based on 30min interval.

                   date   open   high    low  close  volume
0   2022-08-18 15:20:00  48.80  48.80  48.80  48.80     100
1   2022-08-18 15:18:00  48.80  48.80  48.80  48.80    2000
2   2022-08-18 15:16:00  48.80  48.80  48.80  48.80    4221
3   2022-08-18 15:11:00  48.75  48.75  48.75  48.75     275
4   2022-08-18 15:00:00  48.80  48.80  48.80  48.80       2
5   2022-08-18 14:59:00  48.75  48.80  48.75  48.80    1000
6   2022-08-18 14:58:00  48.70  48.70  48.65  48.65     500
7   2022-08-18 14:57:00  48.70  48.70  48.70  48.70    1060
8   2022-08-18 14:56:00  48.65  48.65  48.65  48.65     207
9   2022-08-18 14:55:00  48.70  48.70  48.70  48.70     118
10  2022-08-18 14:53:00  48.70  48.70  48.70  48.70      60
11  2022-08-18 14:51:00  48.55  48.65  48.55  48.65    1937
12  2022-08-18 14:50:00  48.65  48.65  48.65  48.65     120
13  2022-08-18 14:48:00  48.60  48.60  48.55  48.55     880
14  2022-08-18 14:46:00  48.60  48.60  48.60  48.60     270
15  2022-08-18 14:44:00  48.60  48.60  48.60  48.60     300
16  2022-08-18 14:39:00  48.60  48.60  48.60  48.60      40
17  2022-08-18 14:38:00  48.60  48.60  48.60  48.60      30
18  2022-08-18 14:33:00  48.65  48.65  48.65  48.65     297
19  2022-08-18 14:30:00  48.65  48.65  48.65  48.65     345
20  2022-08-18 14:29:00  48.50  48.55  48.45  48.55    1175
21  2022-08-18 14:26:00  48.60  48.60  48.45  48.45    2256
22  2022-08-18 14:25:00  48.60  48.60  48.60  48.60     325
23  2022-08-18 14:20:00  48.60  48.70  48.60  48.70    1834
24  2022-08-18 14:19:00  48.65  48.65  48.55  48.60    3545
25  2022-08-18 14:18:00  48.50  48.65  48.50  48.65     848
26  2022-08-18 14:17:00  48.50  48.55  48.50  48.55    2160
27  2022-08-18 14:16:00  48.65  48.65  48.60  48.60     254

rows 19 should be combine all the values from 19 to 27, low would be the lowest value in these rows and so on.

I need a flexible solution that will work with any interval.

please let me know if my question is not clear, thank you

CodePudding user response:

Use pd.Grouper. Analyzing stock prices with pandas is also common enough that the library includes an Open-High-Low-Close (ohlc) function to deal with it.

df.groupby(pd.Grouper(key="date", freq="15T"))["close"].ohlc()

15T means 15 minutes. You can find other time units in Offset aliases.

  • Related