I am using data from yfinance which returns a pandas Data-Frame.
Volume
Datetime
2021-09-13 09:30:00-04:00 951104
2021-09-13 09:35:00-04:00 408357
2021-09-13 09:40:00-04:00 498055
2021-09-13 09:45:00-04:00 466363
2021-09-13 09:50:00-04:00 315385
2021-12-06 15:35:00-05:00 200748
2021-12-06 15:40:00-05:00 336136
2021-12-06 15:45:00-05:00 473106
2021-12-06 15:50:00-05:00 705082
2021-12-06 15:55:00-05:00 1249763
There are 5 minute intra-day intervals in the data-frame. I want to resample to daily data and get the idxmax of the maximum volume for that day.
df.resample("B")["Volume"].idxmax()
Returns an error:
ValueError: attempt to get argmax of an empty sequence
I used B(business-days) as the resampling period, so there shouldn't be any empty sequences.
I should say .max() works fine.
Also using .agg as was suggested in another question returns an error:
df["Volume"].resample("B").agg(lambda x : np.nan if x.count() == 0 else x.idxmax())
error:
IndexError: index 77 is out of bounds for axis 0 with size 0
CodePudding user response:
For me working test if all NaN
s per group in if-else
:
df = df.resample("B")["Volume"].agg(lambda x: np.nan if x.isna().all() else x.idxmax())
CodePudding user response:
You can use groupby
as an alternative of resample
:
>>> df.groupby(df.index.normalize())['Volume'].agg(Datetime='idxmax', Volume='max')
Datetime Volume
Datetime
2021-09-13 2021-09-13 09:30:00 951104
2021-12-06 2021-12-06 15:55:00 1249763