I am trying to convert the time I watched a Netflix show to a float so I can total it up, but I cannot figure out how to convert it. I have tried several approaches, including:
temp['Minutes'] = temp['Duration'].apply(lambda x: float(x))
Error: ValueError: could not convert string to float: '00:54:45'
Sample row from the data:
2022-05-18 05:21:42 00:54:45 NaN Ozark: Season 4: Mud (Episode 13) NaN Amazon FTVET31DOVI2020 Smart TV 00:54:50 00:54:50 US (United States) Wednesday 2022-05-18
CodePudding user response:
I believe what you are trying to do is convert the seconds after the 54 into a percentage of a full minute (that is, out of 60). This should give you the answer you are looking for, assuming your data is consistent:
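import pandas as pd

data = {
    'time': ['00:54:45', '00:05:15', '01:32:00', '00:28:00']
}
df = pd.DataFrame(data)
# Drop the hours field, keeping 'MM:SS' (note: '01:32:00' becomes '32:00')
df['time'] = df['time'].apply(lambda x: ''.join(x.split(':', 1)[1:]))
# Express the seconds as a whole-number percentage of a minute
df['percent'] = df['time'].apply(lambda x: x.split(':')[1])
df['percent'] = (df['percent'].astype(int) / 60) * 100
df['percent'] = df['percent'].astype(int)
# Join minutes and the zero-padded percentage into a decimal string such as '54.75'
df['final'] = df['time'].apply(lambda x: str(x).split(':')[0]) + '.' + df['percent'].astype(str).str.zfill(2)
df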
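If the goal is a number that can be summed directly, the string handling can be replaced with plain arithmetic. The following is only a minimal sketch: the helper name hms_to_minutes is hypothetical, and it assumes every value is a well-formed 'HH:MM:SS' string. Unlike the string-based approach, it keeps the hours field.

```python
import pandas as pd


def hms_to_minutes(value: str) -> float:
    """Convert an 'HH:MM:SS' string to minutes as a float (hypothetical helper)."""
    # Assumes exactly three colon-separated integer fields
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 60 + minutes + seconds / 60


df = pd.DataFrame({"time": ["00:54:45", "00:05:15", "01:32:00", "00:28:00"]})
df["minutes"] = df["time"].apply(hms_to_minutes)  # float minutes per row
total_minutes = df["minutes"].sum()               # total across all rows
```

Because the result column is numeric, it can be passed straight to sum() or any other aggregation.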
CodePudding user response:
If your goal is to convert a pd.Series of strings to a datatype that can be summed, working with pd.Timedelta objects may help here.
Sample data:
import pandas as pd
durations = {"Duration": ['00:54:45', '00:05:15', '01:32:00', '00:28:00']}
temp = pd.DataFrame(durations)
Converting strings to Timedelta objects:
time_deltas = pd.to_timedelta(temp["Duration"])
The time_deltas pd.Series can then be summed and the total converted to minutes:
minutes = time_deltas.sum().total_seconds() / 60
minutes # 180.0
If you wanted to add the Timedeltas to the existing DataFrame:
temp["time_deltas"] = pd.to_timedelta(temp["Duration"])
minutes = temp["time_deltas"].sum().total_seconds() / 60
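If a per-row float column is wanted, as in the original attempt with temp['Minutes'], one possible approach is the .dt accessor on the Timedelta column. This is a sketch that assumes the Duration strings all parse cleanly; the column name Minutes is taken from the question.

```python
import pandas as pd

temp = pd.DataFrame({"Duration": ["00:54:45", "00:05:15", "01:32:00", "00:28:00"]})

# Parse the strings to Timedelta, then convert each row to float minutes
temp["Minutes"] = pd.to_timedelta(temp["Duration"]).dt.total_seconds() / 60

# The float column can be summed like any other numeric column
total = temp["Minutes"].sum()
```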