I have a dataframe with variable start time values as 0 days 04:52:00
and end time as 0 days 08:54:00
and a third variable AT.
ST ET AT
0 days 04:52:00 0 days 08:54:00 1198
0 days 04:54:00 0 days 08:59:00 1195
0 days 04:56:00 0 days 10:16:00 1120
0 days 04:57:00 1 days 01:33:00 204
0 days 04:57:00 0 days 09:15:00 1182
0 days 05:02:00 0 days 08:53:00 1209
0 days 05:04:00 0 days 20:23:00 521
The datatypes of the three variables on enquiry obtained is pandas.core.series.Series
type(df['ST'])
type(df['ET'])
type(df['AT'])
df.to_dict()
{'ST': {0: '0 days 04:52:00',
1: '0 days 04:54:00',
2: '0 days 04:56:00',
3: '0 days 04:57:00',
4: '0 days 04:57:00',
5: '0 days 05:02:00',
6: '0 days 05:04:00'},
'ET': {0: '0 days 08:54:00',
1: '0 days 08:59:00',
2: '0 days 10:16:00',
3: '1 days 01:33:00',
4: '0 days 09:15:00',
5: '0 days 08:53:00',
6: '0 days 20:23:00'},
'AT': {0: 1198, 1: 1195, 2: 1120, 3: 204, 4: 1182, 5: 1209, 6: 521}}
I wanted to extract the hours, minutes, and seconds value from ST and ET and use it in datetime format.
df['hst']= df['ST'].dt.components['hours'].astype(int)
df['mst']= df['ST'].dt.components['minutes'].astype(int)
df['sst']= df['ST'].dt.components['seconds'].astype(int)
df['het']= df['ET'].dt.components['hours'].astype(int)
df['met']= df['ET'].dt.components['minutes'].astype(int)
df['set']= df['ET'].dt.components['seconds'].astype(int)
However, even after using .astype(int)
the datatype of df['hst'], df['mst'], df['sst'], df['het'], df['met'], and df['set']
is still pandas.core.series.Series
I receive the error on executing the code below:
from datetime import datetime, timedelta
start = datetime(2021,7,11,df['hst'],df['mst'],df['sst'])
Error: cannot convert the series to <class 'int'>
CodePudding user response:
you can add the durations (pd.Series of type timedelta) to a reference date to get datetime columns:
# make sure the data type of the elements is timedelta:
df['ST'] = pd.to_timedelta(df['ST'])
df['ET'] = pd.to_timedelta(df['ET'])
# we need a reference date to which we can add the durations
ref_date = pd.Timestamp("2021-07-11")
df['start'] = ref_date df['ST']
df['end'] = ref_date df['ET']
df.head()
ST ET AT start end
0 0 days 04:52:00 0 days 08:54:00 1198 2021-07-11 04:52:00 2021-07-11 08:54:00
1 0 days 04:54:00 0 days 08:59:00 1195 2021-07-11 04:54:00 2021-07-11 08:59:00
2 0 days 04:56:00 0 days 10:16:00 1120 2021-07-11 04:56:00 2021-07-11 10:16:00
3 0 days 04:57:00 1 days 01:33:00 204 2021-07-11 04:57:00 2021-07-12 01:33:00
4 0 days 04:57:00 0 days 09:15:00 1182 2021-07-11 04:57:00 2021-07-11 09:15:00