Below is the dataframe for one of the stock in indian market.
open high low close volume
date
2021-06-17 09:15:00 05:30 815.55 819.15 807.05 818.45 54372
2021-06-17 09:30:00 05:30 818.45 820.70 817.20 818.00 22417
2021-06-17 09:45:00 05:30 818.00 820.70 818.00 820.00 14685
2021-06-17 10:00:00 05:30 820.00 820.55 818.45 820.55 17466
2021-06-17 10:15:00 05:30 820.55 823.75 820.00 821.00 31069
2021-06-17 10:30:00 05:30 821.00 821.00 818.40 818.45 11139
2021-06-17 10:45:00 05:30 819.05 819.55 817.20 819.25 17747
2021-06-17 11:00:00 05:30 819.35 820.30 818.80 819.00 6770
2021-06-17 11:15:00 05:30 819.50 819.50 817.60 818.55 7616
2021-06-17 11:30:00 05:30 818.95 819.50 818.00 819.40 6856
2021-06-17 11:45:00 05:30 818.90 820.00 818.55 819.90 4147
2021-06-17 12:00:00 05:30 819.90 822.90 818.15 822.45 18668
2021-06-17 12:15:00 05:30 822.45 822.50 820.35 820.55 8529
2021-06-17 12:30:00 05:30 820.50 821.95 819.00 821.50 15214
2021-06-17 12:45:00 05:30 821.95 822.25 821.00 821.50 6072
2021-06-17 13:00:00 05:30 821.50 824.25 820.80 824.25 26428
2021-06-17 13:15:00 05:30 824.70 827.00 823.50 825.65 54448
2021-06-17 13:30:00 05:30 825.70 828.40 825.50 828.25 56071
2021-06-17 13:45:00 05:30 828.10 829.95 823.60 824.20 76385
2021-06-17 14:00:00 05:30 824.20 825.30 821.65 822.20 36921
2021-06-17 14:15:00 05:30 822.70 823.95 821.25 822.75 25237
2021-06-17 14:30:00 05:30 822.75 823.95 821.50 822.55 22952
2021-06-17 14:45:00 05:30 822.55 822.60 820.15 821.90 22285
2021-06-17 15:00:00 05:30 821.60 822.75 820.45 820.65 28557
2021-06-17 15:15:00 05:30 820.45 821.80 818.00 819.65 50378
2021-06-18 09:15:00 05:30 820.50 835.00 820.50 831.40 246519
2021-06-18 09:30:00 05:30 831.00 833.35 829.00 830.00 142612
2021-06-18 09:45:00 05:30 830.00 830.55 822.50 822.80 114185
2021-06-18 10:00:00 05:30 823.45 827.25 821.85 824.45 57072
2021-06-18 10:15:00 05:30 824.45 826.30 822.25 822.40 29404
2021-06-18 10:30:00 05:30 822.40 823.95 819.55 821.00 48454
2021-06-18 10:45:00 05:30 821.20 821.85 811.95 812.75 69394
2021-06-18 11:00:00 05:30 812.75 822.00 812.60 821.30 70689
2021-06-18 11:15:00 05:30 821.30 822.35 819.55 822.00 25185
2021-06-18 11:30:00 05:30 821.90 822.00 820.05 820.90 17239
2021-06-18 11:45:00 05:30 821.15 822.85 820.30 822.50 17539
2021-06-18 12:00:00 05:30 822.50 822.50 820.30 822.10 19281
2021-06-18 12:15:00 05:30 822.10 824.15 821.00 823.55 24699
2021-06-18 12:30:00 05:30 823.50 823.85 821.30 823.55 14806
2021-06-18 12:45:00 05:30 823.20 823.80 821.40 822.25 9944
2021-06-18 13:00:00 05:30 822.25 822.25 818.60 820.45 15122
2021-06-18 13:15:00 05:30 820.45 821.80 817.00 817.60 12859
2021-06-18 13:30:00 05:30 817.60 819.75 817.20 819.70 14129
2021-06-18 13:45:00 05:30 819.70 821.65 819.00 821.00 9999
2021-06-18 14:00:00 05:30 821.50 822.25 820.50 820.65 10744
2021-06-18 14:15:00 05:30 820.65 822.00 820.45 822.00 11466
2021-06-18 14:30:00 05:30 822.00 823.50 821.25 822.80 14608
2021-06-18 14:45:00 05:30 822.75 827.00 822.25 824.35 36461
2021-06-18 15:00:00 05:30 824.35 826.85 822.00 825.70 127325
2021-06-18 15:15:00 05:30 826.20 828.60 825.05 826.00 119421
As observed the first candle starts at 9:15 and the last candle ends at 3:15 in 15 minutes time frame.
I am trying to resample this data to 75 minutes time.
I tried below code
df=df.resample(rule='75T', closed='left', label='left',origin=df.index.min()).agg(OrderedDict([('open', 'first'),('high', 'max'),('low', 'min'),('close', 'last'),('volume', 'sum')])).dropna()
This does produce almost correct results, except for one issue.
open high low close volume
date
2021-06-17 09:15:00 05:30 815.55 823.75 807.05 821.00 140009
2021-06-17 10:30:00 05:30 821.00 821.00 817.20 819.40 50128
2021-06-17 11:45:00 05:30 818.90 822.90 818.15 821.50 52630
2021-06-17 13:00:00 05:30 821.50 829.95 820.80 822.20 250253
2021-06-17 14:15:00 05:30 822.70 823.95 818.00 819.65 149409
2021-06-18 09:00:00 05:30 820.50 835.00 820.50 824.45 560388
2021-06-18 10:15:00 05:30 824.45 826.30 811.95 822.00 243126
2021-06-18 11:30:00 05:30 821.90 824.15 820.05 823.55 93564
2021-06-18 12:45:00 05:30 823.20 823.80 817.00 821.00 62053
2021-06-18 14:00:00 05:30 821.50 827.00 820.45 825.70 200604
2021-06-18 15:15:00 05:30 826.20 828.60 825.05 826.00 119421
2021-06-21 08:15:00 05:30 823.80 829.70 818.90 829.00 70687
2021-06-21 09:30:00 05:30 829.50 833.95 827.55 831.40 199029
2021-06-21 10:45:00 05:30 831.95 832.50 827.60 831.70 83466
2021-06-21 12:00:00 05:30 831.70 833.15 830.00 833.00 90051
2021-06-21 13:15:00 05:30 832.80 833.20 828.10 829.25 55181
2021-06-21 14:30:00 05:30 829.10 831.45 825.65 828.50 106299
As we can see on the first day everything looks great. The second day (2021-06-18) starts at 09:00 instead of 09:15. Also even though I have not provided base data in 15 minutes time frame, the result for the 3rd day (2021-06-21) starts at 08:15.
How can I fix this? I want the candle to start at 9:15 every day and the last candle should be ended at 14:15, as the Indian market closes at 15:30 pm.
CodePudding user response:
I think the problem is the original
of resample
being the first value of your index rather than each day. To fix that you can groupby
on date and use the first index of each group:
print (df.groupby(df.index.date)
.apply(lambda d: d.resample(rule='75T', closed='left', label='left', origin=d.index.min())
.agg({'open':'first','high': 'max','low':'min','close': 'last','volume': 'sum'}).dropna()))
open high low close volume
date
2021-06-17 2021-06-17 09:15:00 05:30 815.55 823.75 807.05 821.00 140009
2021-06-17 10:30:00 05:30 821.00 821.00 817.20 819.40 50128
2021-06-17 11:45:00 05:30 818.90 822.90 818.15 821.50 52630
2021-06-17 13:00:00 05:30 821.50 829.95 820.80 822.20 250253
2021-06-17 14:15:00 05:30 822.70 823.95 818.00 819.65 149409
2021-06-18 2021-06-18 09:15:00 05:30 820.50 835.00 820.50 822.40 589792
2021-06-18 10:30:00 05:30 822.40 823.95 811.95 820.90 230961
2021-06-18 11:45:00 05:30 821.15 824.15 820.30 822.25 86269
2021-06-18 13:00:00 05:30 822.25 822.25 817.00 820.65 62853
2021-06-18 14:15:00 05:30 820.65 828.60 820.45 826.00 309281