I have a pandas DataFrame (data) whose ['Duration'] column has dtype 'object'. It holds time durations in the format '%H:%M:%S', such as '1:47:54' (7 characters), but also durations with only minutes and seconds, '%M:%S', such as '45:20' or '03:20' (5 characters).
I want to convert this column to pandas Timedelta.
However, it seems that I need to have all of my rows in the column with the same length of characters because when I try to do:
data['Duration'] = pd.to_timedelta(data['Duration'])
I get: "ValueError: expected hh:mm:ss format"
Do you have any idea how to solve this problem?
Is it possible to search for the rows in a column that contain ':' only once?
And then add to these rows a string with '0:' ?
Maybe with:
data['Duration'] = '0:' + data['Duration'].astype(str)
Thank you in advance!
CodePudding user response:
You can use Series.where, which keeps the values where the condition is True and replaces the rest with `other`:
df['Timedelta'] = pd.to_timedelta(df['Duration'].where(df['Duration'].str.len() >= 7,
                                                       other='0:' + df['Duration']))
print(df)
# Output
   Duration        Timedelta
0   1:47:54  0 days 01:47:54
1     45:20  0 days 00:45:20
2     03:20  0 days 00:03:20
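
The question also asks whether rows containing ':' only once can be selected directly. A minimal sketch of that variant follows, assuming every value is a string in either H:MM:SS or MM:SS form. The sample frame and the names `only_minutes` and `normalized` are made up for illustration.

```python
import pandas as pd

# Sample data mirroring the values from the question
df = pd.DataFrame({'Duration': ['1:47:54', '45:20', '03:20']})

# True for rows whose string contains exactly one ':' (MM:SS form)
only_minutes = df['Duration'].str.count(':') == 1

# Prefix only those rows with '0:' so every value reads as H:MM:SS
normalized = df['Duration'].mask(only_minutes, '0:' + df['Duration'])

df['Timedelta'] = pd.to_timedelta(normalized)
print(df)
```

Counting colons instead of checking the string length should also cope with values such as '10:47:54' or '5:20', whose lengths differ from 7 and 5.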