Datetime strptime(): Unconverted data remains: 0

Time:10-12

I am using pandas, and I'm trying to convert the following string column "df['Date/Time']" to the datetime format %H:%M.

0    0630  --> should be 06:30 etc.
1    1300
2    2400
3    0800
4    1030
5    1300
6    0001
7    0900
8    0900
9    0800
Name: Date/Time, dtype: object

I also removed any whitespace using:

df['Date/Time'] = df['Date/Time'].apply(lambda x: x.strip())

However, when I try to convert the string cells to the desired format, I get the error that not all data could be converted.

df['Time_reformatted'] = df['Date/Time'].apply(lambda x: datetime.strptime(x,'%H%M').strftime('%H:%M'))

--> ValueError: unconverted data remains: 0

I don't really understand where the 0 could be that causes the trouble. It is the strptime argument that raises the error...any ideas?

Also, is there a more elegant way than using that many lambdas? ;)

CodePudding user response:

The error is caused by the invalid value 2400: the hour cannot be 24, the maximum is 23 (see the Python datetime format docs for more info). With '%H%M', strptime reads the hour as 2 and the minutes as 40, which leaves the trailing 0 unconverted. Since the datetime format specifiers cannot handle this value, you'll have to implement your own conversion logic:

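import datetime

def parse_times(val):
    val = val.strip()
    h, m = int(val[:2]), int(val[2:])
    # carry any minute overflow (m >= 60) into the hours
    if (hrs := m // 60) > 0:
        h += hrs
        m = m - hrs*60
    # wrap the hours so that 24 becomes 0
    h = h % 24
    return datetime.time(hour=h, minute=m).strftime('%H:%M')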

df['Date/Time'].apply(parse_times)

#output
0    06:30
1    13:00
2    00:00
3    08:00
4    10:30
5    13:00
6    00:01
7    09:00
8    09:00
9    08:00
Name: Date/Time, dtype: object

Or, with slightly different logic: compute the total minutes first, then convert back to hours and minutes:

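import datetime

def parse_times_ii(val):
    val = val.strip()
    h, m = int(val[:2]), int(val[2:])
    # total minutes since midnight, then split back into hours and minutes
    minutes = h*60 + m
    hr = minutes//60
    minutes = minutes - hr*60
    # wrap the hours so that 24 becomes 0
    hr = hr % 24
    return datetime.time(hour=hr, minute=minutes).strftime('%H:%M')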

df['Date/Time'].apply(parse_times_ii)

#output
0    06:30
1    13:00
2    00:00
3    08:00
4    10:30
5    13:00
6    00:01
7    09:00
8    09:00
9    08:00
Name: Date/Time, dtype: object
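
To find out which value trips strptime in the first place, a quick check like the following may help. It is only a sketch: the sample list and the helper name try_parse are made up for illustration, and it uses plain Python without pandas. For "2400" it should report the same "unconverted data remains" error as in the question.

```python
from datetime import datetime

# A few sample values taken from the column shown in the question
samples = ["0630", "1300", "2400", "0001"]

def try_parse(value, fmt="%H%M"):
    """Return the reformatted time, or the ValueError message if parsing fails."""
    try:
        return datetime.strptime(value, fmt).strftime("%H:%M")
    except ValueError as exc:
        return f"error: {exc}"

# Map every sample to its outcome so the failing value stands out
results = {value: try_parse(value) for value in samples}
for value, outcome in results.items():
    print(value, "->", outcome)
```

Running the same helper over df['Date/Time'].unique() would list every offending value at once instead of stopping at the first one.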

CodePudding user response:

A simple string operation does the trick if you just need to clean up "2400" values. For example:

import pandas as pd

# clean 2400, make sure no single "0" values remain
df["Date/Time"] = df["Date/Time"].str.replace("2400", "0000", regex=False).str.zfill(4)

# insert colon
df["Date/Time"] = df["Date/Time"].str[:2] + ":" + df["Date/Time"].str[2:]

df["Date/Time"]
0    06:30
1    13:00
2    00:00
3    08:00
4    10:30
5    13:00
6    00:01
7    09:00
8    09:00
9    08:00
Name: Date/Time, dtype: object
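
Note that the string approach only rewrites 2400, so any other out-of-range value would slip through unnoticed. The sketch below is one possible way to guard against that, not a definitive solution: the sample Series (including the deliberately bad value "2460") and the variable names are assumptions for illustration. It also strips whitespace with the .str accessor instead of a lambda, then uses pd.to_datetime with errors="coerce" to flag anything that still does not parse as %H:%M.

```python
import pandas as pd

# Hypothetical sample standing in for df["Date/Time"]; "2460" is deliberately invalid
s = pd.Series([" 0630", "1300 ", "2400", "0001", "2460"], name="Date/Time")

# Vectorized cleanup: strip whitespace, map 2400 to 0000, pad to four digits
cleaned = s.str.strip().str.replace("2400", "0000", regex=False).str.zfill(4)

# Insert the colon between hours and minutes
formatted = cleaned.str[:2] + ":" + cleaned.str[2:]

# Validate: values that are not a real HH:MM time become NaT
parsed = pd.to_datetime(formatted, format="%H:%M", errors="coerce")
invalid = formatted[parsed.isna()]

print(formatted)
print("could not be parsed:", invalid.tolist())
```

Whether such leftovers should be dropped, corrected, or raised as an error depends on the data, so the sketch only reports them.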