Home > Mobile >  How can i turn time data to time format when it's original format (string) looks like this?
How can i turn time data to time format when it's original format (string) looks like this?

Time:10-27

1164570    1H45M
1421781        0
458245        7H
2714970       6H
2956491        0
89924          0
580685        3H
1330835    1H45M
599197     6H30M
541245        7H
2257962        0
1418006    1H15M
407282     5H30M
804217     7H30M
1221037    6H30M
1461747        0
148837        3H
168789        2H
2245125       5H
2079324        0
2014516        0
2272373      45M
470768         0
585334     2H45M
163046     2H30M
(df.sample)

So I have hour and minute data in string format and need to transform it to datetime or to numeral format, (meaning 1h 30min to looking like 1,5 etc.)

I tried to use pandas datetime library but as i expected, it doesn't understand this kind of string format.

diary_test['test'] = pd.to_datetime(diary_test['time'],format= '%H:%M' ).dt.time

CodePudding user response:

For numeric get integers before H and M and sum with divide 60 for minutes:

h = df['time'].str.extract(r'(\d )H', expand=False).astype(float)
m = df['time'].str.extract(r'(\d )M', expand=False).astype(float).div(60)

df['numeric'] = h.add(m, fill_value=0)
df['times'] = pd.to_datetime(h.add(m, fill_value=0).mul(3600), unit='s').dt.time
print (df)
          time  numeric     times
1164570  1H45M     1.75  01:45:00
1421781      0      NaN       NaT
458245      7H     7.00  07:00:00
2714970     6H     6.00  06:00:00
2956491      0      NaN       NaT
89924        0      NaN       NaT
580685      3H     3.00  03:00:00
1330835  1H45M     1.75  01:45:00
599197   6H30M     6.50  06:30:00
541245      7H     7.00  07:00:00
2257962      0      NaN       NaT
1418006  1H15M     1.25  01:15:00
407282   5H30M     5.50  05:30:00
804217   7H30M     7.50  07:30:00
1221037  6H30M     6.50  06:30:00
1461747      0      NaN       NaT
148837      3H     3.00  03:00:00
168789      2H     2.00  02:00:00
2245125     5H     5.00  05:00:00
2079324      0      NaN       NaT
2014516      0      NaN       NaT
2272373    45M     0.75  00:45:00
470768       0      NaN       NaT
585334   2H45M     2.75  02:45:00
163046   2H30M     2.50  02:30:00

CodePudding user response:

Use directly to_timedelta (after conversion to lowercase):

diary_test['hours'] = (pd.to_timedelta(diary_test['time'].str.lower())
                         .dt.total_seconds().div(3600)
                       )

# if you also need as time delta
diary_test['timedelta'] = pd.to_timedelta(diary_test['time'].str.lower())
diary_test['hours'] = diary_test['timedelta'].dt.total_seconds().div(3600)

output:

          time  hours       timedelta
1164570  1H45M   1.75 0 days 01:45:00
1421781      0   0.00 0 days 00:00:00
458245      7H   7.00 0 days 07:00:00
2714970     6H   6.00 0 days 06:00:00
2956491      0   0.00 0 days 00:00:00
89924        0   0.00 0 days 00:00:00
580685      3H   3.00 0 days 03:00:00
1330835  1H45M   1.75 0 days 01:45:00
599197   6H30M   6.50 0 days 06:30:00
541245      7H   7.00 0 days 07:00:00
2257962      0   0.00 0 days 00:00:00
1418006  1H15M   1.25 0 days 01:15:00
407282   5H30M   5.50 0 days 05:30:00
804217   7H30M   7.50 0 days 07:30:00
1221037  6H30M   6.50 0 days 06:30:00
1461747      0   0.00 0 days 00:00:00
148837      3H   3.00 0 days 03:00:00
168789      2H   2.00 0 days 02:00:00
2245125     5H   5.00 0 days 05:00:00
2079324      0   0.00 0 days 00:00:00
2014516      0   0.00 0 days 00:00:00
2272373    45M   0.75 0 days 00:45:00
470768       0   0.00 0 days 00:00:00
585334   2H45M   2.75 0 days 02:45:00
163046   2H30M   2.50 0 days 02:30:00
  • Related