I am trying to count how often each hour of the day falls between two timestamps in a dataframe. For example, one row of data has '2022-01-01 00:35:00' and '2022-01-01 05:29:47'. I would like that row to add one count to each of Hours 0, 1, 2, 3, 4, and 5.
Start Time | End Time |
---|---|
2022-01-01 00:35:00 | 2022-01-01 05:29:47 |
2022-01-01 00:55:00 | 2022-01-01 05:00:17 |
2022-01-01 01:35:00 | 2022-01-01 06:26:00 |
2022-01-01 02:29:00 | 2022-01-01 04:25:17 |
I have been trying to capture the time delta between the two, but I have not been able to figure out how to count the frequency of hours.
CodePudding user response:
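You can extract the hour from each timestamp and then calculate the delta (this assumes both columns hold strings and each row starts and ends on the same day):
import datetime
# Parse each string and keep only the hour component (0-23)
df['start_hour'] = [datetime.datetime.strptime(i, "%Y-%m-%d %H:%M:%S").hour for i in df['Start Time']]
df['end_hour'] = [datetime.datetime.strptime(i, "%Y-%m-%d %H:%M:%S").hour for i in df['End Time']]
# Difference between the end hour and the start hour
df['delta'] = df['end_hour'] - df['start_hour']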
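The same hour extraction can also be done without a Python loop. The following is a minimal sketch, assuming the two columns hold strings in the format shown in the question; the sample dataframe is rebuilt from the question's table, and `start` and `end` are illustrative names only.

```python
import pandas as pd

# Sample rows copied from the table in the question
df = pd.DataFrame({
    "Start Time": ["2022-01-01 00:35:00", "2022-01-01 00:55:00",
                   "2022-01-01 01:35:00", "2022-01-01 02:29:00"],
    "End Time": ["2022-01-01 05:29:47", "2022-01-01 05:00:17",
                 "2022-01-01 06:26:00", "2022-01-01 04:25:17"],
})

# Parse each column once, then read the hour through the .dt accessor
start = pd.to_datetime(df["Start Time"])
end = pd.to_datetime(df["End Time"])

df["start_hour"] = start.dt.hour
df["end_hour"] = end.dt.hour

# Like the loop version, this only makes sense when a row
# starts and ends on the same calendar day
df["delta"] = df["end_hour"] - df["start_hour"]
```

Note that `delta` only says how many hours apart the two timestamps are; it does not yet say which hours were covered.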
CodePudding user response:
Try this:
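# Assumes both columns are already datetime (e.g. via pd.to_datetime) and each row stays within one day; the + 1 includes the end hour itself
df['freq'] = df.apply(lambda x: [i + x['Start Time'].hour for i in range(x['End Time'].hour - x['Start Time'].hour + 1)], axis=1)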
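To get from a per-row list of hours to an actual frequency per hour, one option is to explode the lists and count the values. This is only a sketch under a few assumptions: the sample data is taken from the question, `hours_touched`, `hours`, and `hour_counts` are hypothetical names, and an hour is counted whenever the interval touches any part of it. Because it steps through real timestamps with `pd.date_range`, it should also cope with rows that cross midnight.

```python
import pandas as pd

# Sample rows copied from the table in the question
df = pd.DataFrame({
    "Start Time": ["2022-01-01 00:35:00", "2022-01-01 00:55:00",
                   "2022-01-01 01:35:00", "2022-01-01 02:29:00"],
    "End Time": ["2022-01-01 05:29:47", "2022-01-01 05:00:17",
                 "2022-01-01 06:26:00", "2022-01-01 04:25:17"],
})
df["Start Time"] = pd.to_datetime(df["Start Time"])
df["End Time"] = pd.to_datetime(df["End Time"])


def hours_touched(row):
    """Return the hour of day for every clock hour the interval overlaps."""
    # Snap the start back to the top of its hour so that hour is included
    first = row["Start Time"].replace(minute=0, second=0, microsecond=0)
    # One timestamp per hour, up to and including the hour of the end time
    stamps = pd.date_range(first, row["End Time"], freq=pd.Timedelta(hours=1))
    return [int(h) for h in stamps.hour]


df["hours"] = df.apply(hours_touched, axis=1)

# One row per (interval, hour) pair, then count how often each hour appears
hour_counts = df["hours"].explode().value_counts().sort_index()
print(hour_counts)
```

For the four sample rows, hours 2, 3, and 4 are each counted four times, while hour 6 is counted once.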