I have the following MultiIndex
created with df.groupby(...).resample()
. It is stock market-like OHLC data grouped by an asset and then having OHLC candle time-series for this asset.
high low close ... avg_trade buys sells
pair timestamp ...
AAVE-ETH 2020-01-01 80.0 80.0 80.0 ... 1280.0 1 0
2020-01-02 96.0 96.0 96.0 ... 1120.0 1 0
ETH-USDC 2020-01-02 1600.0 1600.0 1600.0 ... 5000.0 1 0
2020-01-05 1620.0 1400.0 1400.0 ... 1125.0 1 1
The df.index
content is:
MultiIndex([('AAVE-ETH', '2020-01-01'),
('AAVE-ETH', '2020-01-02'),
('ETH-USDC', '2020-01-02'),
('ETH-USDC', '2020-01-05')],
names=['pair', 'timestamp'])
I would like to do a DataFrame.truncate()
like operation by the second index (timestamp) so that I discard all entries beyond a certain timestamp.
However the naive df.truncate(timestamp)
will give an error:
TypeError: Level type mismatch: 2020-01-04 00:00:00
Can a grouped data frame be truncated by its second index (time series) somehow?
CodePudding user response:
You can use get_level_values for "timestamp" and filter by required date literal:
# Create dataframe
df = pd.DataFrame(data=[["AAVE-ETH","2020-01-01",80.0,80.0,80.0,80.0],["AAVE-ETH","2020-01-02",96.0,96.0,96.0,96.0],["AAVE-ETH","2020-01-03",95.0,95.0,95.0,95.0],["ETH-USDC","2020-01-01",96.0,96.0,96.0,96.0],["ETH-USDC","2020-01-02",97.0,97.0,97.0,97.0],["ETH-USDC","2020-01-03",98.0,98.0,98.0,98.0]], columns=["pair","timestamp","high","low","close","open"])
df["timestamp"] = df["timestamp"].apply(pd.to_datetime)
# Apply index
index = pd.MultiIndex.from_frame(df[["pair", "timestamp"]])
df = df.set_index(index).drop(["pair", "timestamp"], axis=1)
high low close open
pair timestamp
AAVE-ETH 2020-01-01 80.0 80.0 80.0 80.0
2020-01-02 96.0 96.0 96.0 96.0
2020-01-03 95.0 95.0 95.0 95.0
ETH-USDC 2020-01-01 96.0 96.0 96.0 96.0
2020-01-02 97.0 97.0 97.0 97.0
2020-01-03 98.0 98.0 98.0 98.0
Filter by date literal (fox ex. keep entries after 2020-01-01):
# Filter by date
df = df[df.index.get_level_values("timestamp") > pd.Timestamp(2020, 1, 1)]
high low close open
pair timestamp
AAVE-ETH 2020-01-02 96.0 96.0 96.0 96.0
2020-01-03 95.0 95.0 95.0 95.0
ETH-USDC 2020-01-02 97.0 97.0 97.0 97.0
2020-01-03 98.0 98.0 98.0 98.0