I have a dataframe and I need to find the most acess hour from the day. I think i need to do some for loops to store the values and after find the most acess hour.
My code until now is:
df['date_time'] = pd.to_datetime(df['date_time'])
This me return:
0 2022-11-24 19:18:37
1 2022-11-25 00:45:35
2 2022-11-25 00:48:01
3 2022-11-25 00:59:38
4 2022-11-25 01:01:07
...
890 2022-11-29 20:55:13
891 2022-11-29 20:55:33
892 2022-11-29 20:56:30
893 2022-11-29 20:57:01
894 2022-11-29 21:06:27
Name: date_time, Length: 895, dtype: datetime64[ns]
This dataframe have 7 days of data, I need find the most acess hour for every day, and of the week. I tried using some for's loop but i dont know much about the Pandas. I will learn more the documentation but if someone knows how to solve this problem....
Thankssss
CodePudding user response:
I would suggest to group by day and take mode on hours:
df['date_time'] = pd.to_datetime(df['date_time'])
df['date'] = df['date_time'].dt.day
df['hour'] = df['date_time'].dt.hour
df_groupped = df.groupby(df['date'])['hour'].agg(pd.Series.mode)
CodePudding user response:
You can use pandas.DataFrame.groupby
with .dt
accessors.
Try this :
df_day = (
df.assign(hour= "Hour " df['date_time'].dt.hour.astype(str),
week= "Week" df['date_time'].dt.isocalendar().week.astype(str))
.groupby("week", as_index=False).agg(MostAccesHour= ("hour", lambda x: x.value_counts().index[0]))
)
df_week= (
df.assign(hour= "Hour " df['date_time'].dt.hour.astype(str),
day= df['date_time'].dt.date)
.groupby("day", as_index=False).agg(MostAccesHour= ("hour", lambda x: x.value_counts().index[0]))
)
# Output :
print(df_week)
day MostAccesHour
0 2022-11-24 Hour 19
1 2022-11-25 Hour 0
2 2022-11-29 Hour 20
print(df_day)
week MostAccesHour
0 Week47 Hour 0
1 Week48 Hour 20