convert string as 'hours' and 'mins' into minutes

Time:12-11

I have a column in my dataframe df:

Time
2 hours 3 mins
5 hours 10 mins
1 hours 40 mins
10 mins
4 hours
6 hours 0 mins

I want to create a new column in df, 'Minutes', that converts this column to minutes:

Minutes
123
310
100
10
240
360

Is there a Python function to do this? What I have tried is:

df['Minutes'] = pd.eval(
    df['Time'].replace(['hours?', 'mins'], ['*60 ', ''], regex=True))

CodePudding user response:

There is an ugly bug here: pd.eval only handles fewer than about 100 rows. To avoid it, strip the strings and then call pd.eval on each value with Series.apply:

df['Minutes'] = (df['Time'].replace(['hours?', 'mins'], ['*60 ', ''], regex=True)
                           .str.strip(' ')
                           .apply(pd.eval))

print (df)
              Time  Minutes
0   2 hours 3 mins      123
1  5 hours 10 mins      310
2  1 hours 40 mins      100
3          10 mins       10
4          4 hours      240
5   6 hours 0 mins      360

#verify for 120 rows
df = pd.concat([df] * 20, ignore_index=True)


df['Minutes1'] = pd.eval(
    df['Time'].replace(['hours?', 'mins'], ['*60 ', ''], regex=True).str.strip(' '))

print (df)

ValueError: unknown type object

Another solution with Series.str.extract and Series.add:

h = df['Time'].str.extract(r'(\d+)\s+hours').astype(float).mul(60)
m = df['Time'].str.extract(r'(\d+)\s+mins').astype(float)

df['Minutes'] = h.add(m, fill_value=0).astype(int)
print (df)
              Time  Minutes
0   2 hours 3 mins      123
1  5 hours 10 mins      310
2  1 hours 40 mins      100
3          10 mins       10
4          4 hours      240
5   6 hours 0 mins      360
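
If you prefer to check the logic without pandas, here is a minimal sketch of the same idea on a single string. The helper name to_minutes is made up for illustration, and accepting the singular forms 'hour' and 'min' is my assumption, not something the question asked for:

```python
import re

def to_minutes(text):
    """Convert a string like '2 hours 3 mins' to total minutes (int).

    Hypothetical helper: a missing hours or mins part counts as 0.
    """
    # One search per unit, mirroring the two str.extract calls above.
    hours = re.search(r"(\d+)\s+hours?", text)
    mins = re.search(r"(\d+)\s+mins?", text)
    total = 0
    if hours:
        total += int(hours.group(1)) * 60
    if mins:
        total += int(mins.group(1))
    return total

samples = ["2 hours 3 mins", "5 hours 10 mins", "1 hours 40 mins",
           "10 mins", "4 hours", "6 hours 0 mins"]
print([to_minutes(s) for s in samples])
# [123, 310, 100, 10, 240, 360]
```

It should also work row by row with df['Time'].apply(to_minutes), although the vectorized str.extract version is likely the better fit for a large column.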

CodePudding user response:

jezrael's answer is excellent, but I spent quite some time working on this, so I figured I'd post it.

You can use a regex to capture the hours and minutes from your column, multiply the hours by 60, add the minutes, and assign the result to a new column:

r = r"(?:(\d+) hours ?)?(?:(\d+) mins)?"
hours = df.Time.str.extract(r)[0].astype(float).fillna(0) * 60
minutes = df.Time.str.extract(r)[1].astype(float).fillna(0)

df['minutes'] = hours + minutes

print(df)

              Time  minutes
0   2 hours 3 mins    123.0
1  5 hours 10 mins    310.0
2  1 hours 40 mins    100.0
3          10 mins     10.0
4          4 hours    240.0
5   6 hours 0 mins    360.0
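
To see what the two capture groups pick up, here is a small sketch using the standard re module on plain strings instead of a column. The function name captured_parts and the 'approx 2 hours' sample are made up for illustration:

```python
import re

# Same pattern as above: both the hours part and the mins part are optional.
PATTERN = re.compile(r"(?:(\d+) hours ?)?(?:(\d+) mins)?")

def captured_parts(text):
    """Return the (hours, minutes) capture groups as strings or None."""
    # Every part of the pattern is optional, so search() always finds a
    # match, possibly an empty one at the very start of the string.
    return PATTERN.search(text).groups()

for sample in ["2 hours 3 mins", "10 mins", "4 hours", "approx 2 hours"]:
    print(repr(sample), "->", captured_parts(sample))
# '2 hours 3 mins' -> ('2', '3')
# '10 mins' -> (None, '10')
# '4 hours' -> ('4', None)
# 'approx 2 hours' -> (None, None)
```

A missing part comes back as None, which is why the pandas version needs fillna(0). The last sample shows a caveat: when the string does not start with the number, the pattern matches an empty string at position 0 and captures nothing, so this approach assumes the column is as clean as in the question.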

I enjoy using https://regexr.com/ to test my regexes.
