I have a dataframe that contains only times stored as strings (dtype: object). How can I create a new column that contains the time interval each row falls into? Sample data is below:
Loss Time
0 02:10
1 22:20
2 20:00
3 01:20
4 12:36
I want to create a column that contains one of the intervals 10AM-4PM, 4PM-8PM, 8PM-6AM or 6AM-10AM. I've tried several approaches without success. The pandas between_time function doesn't work because it needs a DatetimeIndex, which I can't use here, and I had no luck with strptime either. Any ideas would be much appreciated.
CodePudding user response:
A simple solution using pd.cut.
Input:
from io import StringIO
import pandas as pd

TESTDATA = StringIO("""Loss_Time
02:10
22:20
20:00
01:20
12:36
05:30
23:59
5:00
9:30
11:30
14:30
16:30
21:20
""")
df = pd.read_csv(TESTDATA, sep=";")
code:
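# Normalise the strings to zero-padded HH:MM (e.g. "5:00" becomes "05:00")
df.Loss_Time = pd.to_datetime(df.Loss_Time, format='%H:%M').dt.strftime('%H:%M')
# Bin edges in hours; 8PM-6AM wraps past midnight, so it needs two bins
bins = [0, 6, 10, 16, 20, 24]
labels = ['8PM-6AM', '6AM-10AM', '10AM-4PM', '4PM-8PM', '8PM-6AM']
# right=False makes each bin [start, end); ordered=False is needed because a label repeats
df['Time Bin'] = pd.cut(pd.to_datetime(df.Loss_Time, format='%H:%M').dt.hour, bins, labels=labels, right=False, ordered=False)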
Prints:
   Loss_Time  Time Bin
0      02:10   8PM-6AM
1      22:20   8PM-6AM
2      20:00   8PM-6AM
3      01:20   8PM-6AM
4      12:36  10AM-4PM
5      05:30   8PM-6AM
6      23:59   8PM-6AM
7      05:00   8PM-6AM
8      09:30  6AM-10AM
9      11:30  10AM-4PM
10     14:30  10AM-4PM
11     16:30   4PM-8PM
12     21:20   8PM-6AM
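To check how the half-open bins treat times that sit exactly on an edge, here is a minimal self-contained sketch of the same pd.cut approach. The helper name bin_loss_times and the edge-case values are made up for illustration, and it assumes a pandas version whose pd.cut accepts ordered=False.

```python
import pandas as pd

# Same edges and labels as in the answer above
BINS = [0, 6, 10, 16, 20, 24]
LABELS = ['8PM-6AM', '6AM-10AM', '10AM-4PM', '4PM-8PM', '8PM-6AM']


def bin_loss_times(times):
    """Return one interval label per 'HH:MM' string (hypothetical helper)."""
    hours = pd.to_datetime(pd.Series(times), format='%H:%M').dt.hour
    # right=False: each bin includes its start hour and excludes its end hour
    binned = pd.cut(hours, BINS, labels=LABELS, right=False, ordered=False)
    return binned.astype(str).tolist()


edge_cases = ['00:00', '05:59', '06:00', '10:00', '16:00', '20:00', '23:59']
for value, label in zip(edge_cases, bin_loss_times(edge_cases)):
    print(value, label)
```

Because every edge is a whole hour, binning on the hour alone is enough here; edges such as 06:30 would need the minutes as well.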
CodePudding user response:
Just parse each string as a datetime and define the brackets you want. The three brackets below are only an example; swap in your own boundaries.
from datetime import datetime
import pandas as pd
foo = pd.Series(['02:10', '22:20','20:00','01:20', '12:36'])
foo_df = pd.DataFrame(foo)
def time_into_bracket(time_str):
    time = datetime.strptime(time_str, "%H:%M").time()
    bracket1 = datetime.strptime('00:00', "%H:%M").time()
    bracket2 = datetime.strptime('08:00', "%H:%M").time()
    bracket3 = datetime.strptime('16:00', "%H:%M").time()
    if time >= bracket1 and time < bracket2:
        return '00:00 - 08:00'
    elif time >= bracket2 and time < bracket3:
        return '08:00 - 16:00'
    else:
        return '16:00 - 24:00'
foo_df['time_bracket'] = foo_df[0].apply(time_into_bracket)
Result:
0 00:00 - 08:00
1 16:00 - 24:00
2 16:00 - 24:00
3 00:00 - 08:00
4 08:00 - 16:00
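A sketch of the same strptime idea using the four intervals from the question could look like the following. The function name interval_for is hypothetical, and it assumes each interval includes its start time and excludes its end time. The 8PM-6AM interval wraps past midnight, so it is handled as the fallback case.

```python
from datetime import datetime, time

# (start, end, label) for the intervals that do not cross midnight
DAY_INTERVALS = [
    (time(6, 0), time(10, 0), '6AM-10AM'),
    (time(10, 0), time(16, 0), '10AM-4PM'),
    (time(16, 0), time(20, 0), '4PM-8PM'),
]


def interval_for(time_str):
    """Map an 'HH:MM' string to its interval label (hypothetical helper)."""
    t = datetime.strptime(time_str, '%H:%M').time()
    for start, end, label in DAY_INTERVALS:
        if start <= t < end:
            return label
    # Anything else is before 06:00 or from 20:00 onwards
    return '8PM-6AM'


for s in ['02:10', '22:20', '20:00', '01:20', '12:36']:
    print(s, interval_for(s))
```

On the dataframe from the question this would presumably be applied as df['Loss Time'].apply(interval_for).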
CodePudding user response:
>>> df['Loss Time'] = pd.to_datetime(df['Loss Time'])
>>> df
Loss Time
0 2021-12-17 02:10:00
1 2021-12-17 22:20:00
2 2021-12-17 20:00:00
3 2021-12-17 01:20:00
4 2021-12-17 12:36:00
>>>
>>> df['Loss Time'] = df['Loss Time'].dt.time  # keep only the time-of-day part
>>>
>>> df
Loss Time
0 02:10:00
1 22:20:00
2 20:00:00
3 01:20:00
4 12:36:00
>>>
Doing this converts the time strings to time objects. You can then filter on them and compare them against whatever times you want (for example, to check whether a time lies between two other times).
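As a rough illustration of that filtering step, the sketch below compares the column of time objects against datetime.time boundaries. The variable names daytime and overnight are made up for the example, and the explicit format='%H:%M' is an assumption about how the strings look.

```python
from datetime import time

import pandas as pd

df = pd.DataFrame({'Loss Time': ['02:10', '22:20', '20:00', '01:20', '12:36']})
# Parse with an explicit format, then keep only the time-of-day part
df['Loss Time'] = pd.to_datetime(df['Loss Time'], format='%H:%M').dt.time

# Rows whose time falls in [10:00, 16:00)
daytime = df[(df['Loss Time'] >= time(10, 0)) & (df['Loss Time'] < time(16, 0))]

# An interval that wraps past midnight needs "or" instead of "and"
overnight = df[(df['Loss Time'] >= time(20, 0)) | (df['Loss Time'] < time(6, 0))]

print(daytime)
print(overnight)
```

The column holds plain Python objects after this conversion, so the comparisons run element by element rather than as fast datetime operations; for large frames, binning on the hour as in the pd.cut answer is likely cheaper.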