Home > Net >  How to separate the data_time colum by days from a dataframe
How to separate the data_time colum by days from a dataframe

Time:11-30

I have a dataframe and I need to find the most acess hour from the day. I think i need to do some for loops to store the values and after find the most acess hour.

My code until now is:

df['date_time'] = pd.to_datetime(df['date_time'])

This me return:

0     2022-11-24 19:18:37
1     2022-11-25 00:45:35
2     2022-11-25 00:48:01
3     2022-11-25 00:59:38
4     2022-11-25 01:01:07
              ...
890   2022-11-29 20:55:13
891   2022-11-29 20:55:33
892   2022-11-29 20:56:30
893   2022-11-29 20:57:01
894   2022-11-29 21:06:27
Name: date_time, Length: 895, dtype: datetime64[ns]

This dataframe have 7 days of data, I need find the most acess hour for every day, and of the week. I tried using some for's loop but i dont know much about the Pandas. I will learn more the documentation but if someone knows how to solve this problem....

Thankssss

CodePudding user response:

I would suggest to group by day and take mode on hours:

df['date_time'] = pd.to_datetime(df['date_time'])
df['date'] = df['date_time'].dt.day
df['hour'] = df['date_time'].dt.hour
df_groupped = df.groupby(df['date'])['hour'].agg(pd.Series.mode)

CodePudding user response:

You can use pandas.DataFrame.groupby with .dt accessors.

Try this :

df_day = (
            df.assign(hour= "Hour "   df['date_time'].dt.hour.astype(str),
                      week= "Week"   df['date_time'].dt.isocalendar().week.astype(str))
              .groupby("week", as_index=False).agg(MostAccesHour= ("hour", lambda x: x.value_counts().index[0]))
        )
    
df_week= (
            df.assign(hour= "Hour "   df['date_time'].dt.hour.astype(str),
                      day=  df['date_time'].dt.date)
              .groupby("day", as_index=False).agg(MostAccesHour= ("hour", lambda x: x.value_counts().index[0]))
         )

# Output :

print(df_week)

          day MostAccesHour
0  2022-11-24       Hour 19
1  2022-11-25        Hour 0
2  2022-11-29       Hour 20

print(df_day)

     week MostAccesHour
0  Week47        Hour 0
1  Week48       Hour 20
  • Related