I have a dataframe of trades, price
and size
price size time 2022-04-18 00:00:00.048541 39649.1 0.007 2022-04-18 00:00:00.064796 39649.0 0.005 2022-04-18 00:00:00.073577 39649.0 0.023 2022-04-18 00:00:00.073579 39649.0 0.011 2022-04-18 00:00:00.073617 39648.0 0.006
I would like to create OHLC minute-bars
The following doesn't work, as open
, high
, low
and close
columns don't exist:
df.resample('1t', closed='left', label='right').agg({
'open': 'first',
'high': 'max',
'low': 'min',
'close': 'last'
})
KeyError: "Column(s) ['close', 'high', 'low', 'open'] do not exist"
Specifying an existing column name, price
for each of the aggregates also doesn't work, as each subsequent aggregation overwrites the next
df.resample('1t', closed='left', label='right').agg({
'price': 'first',
'price': 'max',
'price': 'min',
'price': 'last'
})
price time 2022-04-18 00:01:00 39638.5 2022-04-18 00:02:00 39628.8 2022-04-18 00:03:00 39552.0 2022-04-18 00:04:00 39651.7
I could create 4 new columns as a copy of price, and then apply my first attempt at the aggregation
df['open'] = df['high'] = df['low'] = df['close'] = df['price']
df.resample('1t', closed='left', label='right').agg({
'open': 'first',
'high': 'max',
'low': 'min',
'close': 'last'
})
open high low close time 2022-04-18 00:01:00 39649.1 39659.4 39605.9 39638.5 2022-04-18 00:02:00 39638.5 39657.5 39622.0 39628.8 2022-04-18 00:03:00 39628.8 39643.7 39552.0 39552.0 2022-04-18 00:04:00 39552.0 39665.7 39550.0 39651.7
Whilst this works, it just feels a little clunky...
Is there a way to do my OHLC aggregation, specifying the final column label and the source column to take the input value from?
CodePudding user response:
Pandas already provides resample(...).ohlc
method out of the box to calculate open, high, low and close values
df.resample('1t', closed='left', label='right')['price'].ohlc()
open high low close
time
2022-04-18 00:01:00 39649.1 39649.1 39648.0 39648.0