Home > front end >  Pandas Timedelta Multiple If Condition
Pandas Timedelta Multiple If Condition

Time:11-09

I would like to create a new column called time_of_day for a dataframe such as below so that 06-12:59 is morning, 13-17:59 is afternoon, 18-22:59 is evening and rest of the time is night.

id TIME
25 00:01
25 02:01
25 06:55
18 22:03
18 23:33
18 00:33

What I expect

id TIME TIME_OF_DAY
25 00:01 Night
25 02:01 Night
25 06:55 Morning
18 22:03 Evening
18 23:33 Night
18 00:33 Night

I tried the numpy vectorization with choices and conditions, but it failed. Then I tried the following:

def conditions(s):
   if (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('06:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('13:00:00')):
       return "Morning" 
   elif (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('13:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('18:00:00')):
       return "Afternoon" 
   elif (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('18:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('23:00:00')):
       return "Evening" 
   elif (pd.to_timedelta(df['TIME']) >= pd.to_timedelta('23:00:00')) & (pd.to_timedelta(df['TIME']) < pd.to_timedelta('06:00:00')):
       return "Night" 
df['TIME_OF_DAY'] = df.apply(conditions, axis=1)

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I overcome this issue? PS: I added .all() for the enf od each if condition, but still got the same error.

CodePudding user response:

Use pandas.cut, it will be more efficient than a function with apply:

bins = ['0', '06:00:00', '13:00:00', '18:00:00', '23:00:00', '24:00:00']
labels = ['Night', 'Morning', 'Afternoon', 'Evening', 'Night']

df['TIME_OF_DAY'] = pd.cut(
  pd.to_timedelta(df['TIME'] ':00'),
  bins=list(map(pd.Timedelta, bins)),
  labels=labels, right=False, ordered=False
)

output:

   id   TIME TIME_OF_DAY
0  25  00:01       Night
1  25  02:01       Night
2  25  06:55     Morning
3  18  22:03     Evening
4  18  23:33       Night
5  18  00:33       Night

If you need a function:

def to_day_period(s):
    bins = ['0', '06:00:00', '13:00:00', '18:00:00', '23:00:00', '24:00:00']
    labels = ['Night', 'Morning', 'Afternoon', 'Evening', 'Night']
    
    return pd.cut(
      pd.to_timedelta(s ':00'),
      bins=list(map(pd.Timedelta, bins)),
      labels=labels, right=False, ordered=False
    )

df['TIME_OF_DAY'] = to_day_period(df['TIME'])

CodePudding user response:

Assuming the sample DataFrame you mentioned in the question, you can make the following modifications to make the code work:

def conditions(t: pd.Timedelta):
   if pd.Timestamp('06:00:00') <= t < pd.Timestamp('13:00:00'):
       return "Morning" 
   elif pd.Timestamp('13:00:00') <= t < pd.Timestamp('18:00:00'):
       return "Afternoon" 
   elif pd.Timestamp('18:00:00') <= t < pd.Timestamp('23:00:00'):
       return "Evening" 
   else:
       return "Night"

df['TIME_OF_DAY'] = df["TIME"].apply(lambda s: conditions(pd.Timestamp(s)))

Output (df):

   id   TIME TIME_OF_DAY
0  25  00:01       Night
1  25  02:01       Night
2  25  06:55     Morning
3  18  22:03     Evening
4  18  23:33       Night
5  18  00:33       Night
  • Related