Home > OS >  how to get result of expanding/resample function in original dataframe using python
how to get result of expanding/resample function in original dataframe using python

Time:04-14

have a dataframe with 1 minute timestamp of open, high, low, close, volume for a token. using expanding or resample function, one can get a new dataframe based on the timeinterval. in my case its 1 day time interval.

i am looking to get the above output in the original dataframe. please assist in the same.

original dataframe:

enter image description here

desired dataframe:

enter image description here

Here "date_1d" is the time interval for my use case. i used expanding function but as the value changes in "date_1d" column, expanding function works on the whole dataframe

''' df["high_1d"] = df["high"].expanding().max()

df["low_1d"] = df["low"].expanding().min()

df["volume_1d"] = df["volume"].expanding().min() '''

then the next challenge was how to find Open and Close based on "date_1d" column

Please assist or ask more questions, if not clear on my desired output.

Fyi - data is huge for 5 years 1 minute data for 100 tokens

thanks in advance

Sukhwant

CodePudding user response:

I'm not sure if I understand it right but for me it looks like you want to groupbyeach day and calculate first last min max for them. Is the column date_1d already there ?

If not:

df["date_1d"] = df["date"].dt.strftime('%Y%m%d')

For the calculations:

df["open_1d"] = df.groupby("date_1d")['open'].transform('first')
df["high_1d"] =  df.groupby("date_1d")['high'].transform('max')
df["low_1d"] =  df.groupby("date_1d")['low'].transform('min')
df["close_1d"] = df.groupby("date_1d")['close'].transform('last')

EDIT: Have a look in your code if this works like you expect it (till we have some of your data I can only guess, sorry :D )

df['high_1d'] = df.groupby('date_1d')['high'].expanding().max().values

It groups the data per "date_1d" but in the group only consider row by row (and the above rows)

  • Related