Home > Mobile >  Filter out data at the end of each month using pd.Period
Filter out data at the end of each month using pd.Period

Time:02-13

I have a pandas data frame and my goal is to filter out data at the end of each month (i.e. 2021-12-31, 2022-01-31, etc).

I have 2 sets of code and 1 works but the other doesn't even though they the type(criteria) is the same - string.

Works:

df_monthly = df[df.index == str(dt.date(2022,1,31))]

Does not work:

df_monthly = df[df.index == str(pd.Period(df.index, freq='M').end_time.date())]

The 2nd line returns this error: ValueError: Value must be Period, string, integer, or datetime

Anyone knows why this is the case and how do I remedy? I'm hoping to be able to proceed using pd.Period() instead of dt.date().

Many thanks!

CodePudding user response:

If the index is a DatetimeIndex, you can call .to_period followed by .to_timestamp on the index to do this:

df_monthly = df.loc[df.index == df.index.to_period('M').to_timestamp('M')]

If the index needs to be converted first, you can use pd.to_datetime (and optionally set errors='coerce' to handle invalid dates)


Another succinct option is to use offsets.MonthEnd:

df_monthly = df.loc[df.index == df.index   pd.offsets.MonthEnd(0)]
  • Related