I have a dataframe with a list of time value as object and needed to convert them to datetime
, the issue is, they are not on the same format so when I try:
df['Total call time'] = pd.to_datetime(df['Total call time'], format='%H:%M:%S')
it gives me an error
ValueError: time data '3:22' does not match format '%H:%M:%S' (match)
or if use this code
df['Total call time'] = pd.to_datetime(df['Total call time'], format='%H:%M')
I get this error
ValueError: unconverted data remains: :58
These are the values on my data
Total call time
2:04:07
3:22:41
2:30:41
2:19:06
1:45:55
1:30:08
1:32:15
1:43:28
**45:48**
1:41:40
5:08:37
**3:22**
4:29:05
2:47:25
2:39:29
2:29:32
2:09:52
3:31:57
2:27:58
2:34:28
3:14:10
2:12:10
2:46:58
CodePudding user response:
times = """\
2:04:07
3:22:41
2:30:41
2:19:06
1:45:55
1:30:08
1:32:15
1:43:28
45:48
1:41:40
5:08:37
3:22
4:29:05
2:47:25
2:39:29
2:29:32
2:09:52
3:31:57
2:27:58
2:34:28
3:14:10
2:12:10
2:46:58""".split()
import pandas as pd
df = pd.DataFrame(times, columns=['elapsed'])
def pad(s):
if len(s) == 4:
return '00:0' s
elif len(s) == 5:
return '00:' s
return s
print(pd.to_timedelta(df['elapsed'].apply(pad)))
Output:
0 0 days 02:04:07
1 0 days 03:22:41
2 0 days 02:30:41
3 0 days 02:19:06
4 0 days 01:45:55
5 0 days 01:30:08
6 0 days 01:32:15
7 0 days 01:43:28
8 0 days 00:45:48
9 0 days 01:41:40
10 0 days 05:08:37
11 0 days 00:03:22
12 0 days 04:29:05
13 0 days 02:47:25
14 0 days 02:39:29
15 0 days 02:29:32
16 0 days 02:09:52
17 0 days 03:31:57
18 0 days 02:27:58
19 0 days 02:34:28
20 0 days 03:14:10
21 0 days 02:12:10
22 0 days 02:46:58
Name: elapsed, dtype: timedelta64[ns]
CodePudding user response:
Alternatively to grovina's answer ... instead of using apply you can directly use the dt
accessor.
Here's a sample:
>>> data = [['2017-12-01'], ['2017-12-
30'],['2018-01-01']]
>>> df = pd.DataFrame(data=data,
columns=['date'])
>>> df
date
0 2017-12-01
1 2017-12-30
2 2018-01-01
>>> df.date
0 2017-12-01
1 2017-12-30
2 2018-01-01
Name: date, dtype: object
Note how df.date is an object? Let's turn it into a date like you want
>>> df.date = pd.to_datetime(df.date)
>>> df.date
0 2017-12-01
1 2017-12-30
2 2018-01-01
Name: date, dtype: datetime64[ns]
The format you want is for string formatting. I don't think you'll be able to convert the actual datetime64 to look like that format. For now, let's make a newly formatted string version of your date in a separate column
>>> df['new_formatted_date'] =
df.date.dt.strftime('%d/%m/%y %H:%M')
>>> df.new_formatted_date
0 01/12/17 00:00
1 30/12/17 00:00
2 01/01/18 00:00
Name: new_formatted_date, dtype: object
Finally, since the df.date column is now of date datetime64... you can use the dt accessor right on it. No need to use apply
>>> df['month'] = df.date.dt.month
>>> df['day'] = df.date.dt.day
>>> df['year'] = df.date.dt.year
>>> df['hour'] = df.date.dt.hour
>>> df['minute'] = df.date.dt.minute
>>> df
date new_formatted_date month day
year hour minute
0 2017-12-01 01/12/17 00:00 12
1 2017 0 0
1 2017-12-30 30/12/17 00:00 12
30 2017 0 0
2 2018-01-01 01/01/18 00:00