Home > Net >  How to group by time-interval from bottom to top using Pandas resample functionality?
How to group by time-interval from bottom to top using Pandas resample functionality?

Time:05-17

I am working with historic data of some stocks. I want to group data by certain time intervals (like 1hr, 3days, etc). Pandas gives amazing functionality of doing this with very less efforts using resampling. But it happens from top-to-bottom (below image). Like -

With interval = 5m

Group 1 => 9:30 - 9:35
Group 2 => 9:35 - 9:40
Group 3 => 9:40 - 9:45

enter image description here

Here, I want to group from bottom-to-top, like -

With interval = 5m

Group 1 => 9:45 - 9:40
Group 2 => 9:40 - 9:35
Group 3 => 9:35 - 9:30

How can I do this with pandas resampling? If there is another way of doing this please mention it as well. Thanks :)

EDIT: I want something like this out of above image data -

5-min groups                     open    max_high     max_low       close   sum_volume
2022-05-05 09:45:00-04:00  162.750000  162.750000  162.529999  162.540100   338003
2022-05-05 09:40:00-04:00  163.000000  163.440002  163.000000  163.220001   419992 
2022-05-05 09:35:00-04:00  163.500000  163.535004  163.500000  163.535004   366042
2022-05-05 09:30:00-04:00  163.850006  163.989899  163.509995  163.649994  2720494

CodePudding user response:

Maybe you can use the iloc to reverse after resample? I'm not sure if that hinders your further calculations, but it can resample and reverse the set.

Since I do not have access to your exact sample data

Here's how I am testing it:

import yfinance as yf
import pandas as pd
import numpy as np

df = yf.download(tickers = 'BTC-USD',
                   start = '2022-05-16',
                   end = '2022-05-17',
                   interval = '1m',
                   group_by = 'ticker',
                   auto_adjust = True).reset_index()

df_1min = df.iloc[110:130,:] #sample timeframe extracted
df_1min.head()

This results in the 1 min df:

1min df

You can then apply resample and the iloc:

conversion = {'Open' : 'first', 
              'High' : 'max', 
              'Low' : 'min', 
              'Close' : 'last', 
              'Volume' : 'sum'}

df_1min = df_1min.set_index('Datetime') 

df_5min = df_1min.resample('5T').agg(conversion)

df_5min.iloc[::-1].reset_index() #reverse

This results in a reversed df:

reversed and resampled

  • Related