I have a dataframe an index column as:
2022-10-01
2022-10-02
2022-10-05
2022-11-03
2022-11-07
2022-12-10
2022-12-11
I would like to have a result set with:
2022-10-01
2022-11-03
2022-12-10
CodePudding user response:
Assuming these are the index values, you can Serieify them, group by their month, and take the first instance per month:
In [355]: df.index
Out[355]:
DatetimeIndex(["2022-10-01", "2022-10-02", "2022-10-05", "2022-11-03",
"2022-11-07", "2022-12-10", "2022-12-11"],
dtype="datetime64[ns]")
In [356]: pd.Index(df.index.to_series().groupby(lambda idx: idx.month).first())
Out[356]: DatetimeIndex(["2022-10-01", "2022-11-03", "2022-12-10"],
dtype='datetime64[ns]')
CodePudding user response:
Assuming "date" your column, group by monthly period and get the rows with the idxmin
:
df['date'] = pd.to_datetime(df['date'])
out = df.loc[df.groupby(df['date'].dt.to_period('M'))['date'].idxmin()]
Output:
date
0 2022-10-01
3 2022-11-03
5 2022-12-10