I used to have a list and only needed to extract the max values in column 33 every day using below code and then export the data.
df_= pd.read_excel (r'file_location.xlsx')
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df_new = (df.groupby(pd.Grouper(key="Date",freq="D"))
.agg({df.columns[33]: np.max})
.reset_index())
Now I have a new task to extract the top 3 valus in the same column everyday. I tried below code but doesn't work.
Any idea?
df_= pd.read_excel (r'file_location.xlsx')
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df_new = (df.groupby(pd.Grouper(key="Date",freq="D"))
.agg({df.columns[33]: np.head(3)})
.reset_index())
CodePudding user response:
You need specify column after groupby
and call GroupBy.head
without agg
:
df_e_new = df.groupby(pd.Grouper(key="Date",freq="D"))[df.columns[33]].head(3)
Or use SeriesGroupBy.nlargest
for top3 sorted values:
df_e_new = df.groupby(pd.Grouper(key="Date",freq="D"))[df.columns[33]].nlargest(3)
For sum top3 values use lambda function:
df_e_new = (df.groupby(pd.Grouper(key="Date",freq="D"))[df.columns[33]]
.agg(lambda x: x.nlargest(3).sum())
.reset_index(name='top3sum'))