Home > Blockchain >  Multi-format string column to be converted into date format
Multi-format string column to be converted into date format

Time:03-20

I have a column, which represents date, however is formatted as string. I can't use simple pandas.to_datetime like:

01/02/2023
Apr 02, 2016
Jun 2021
2023/12/01

I've tried to create a below formula, where I would list of potential date formats and using for loop to convert the column into desired format. However, it is obviously wrong as the column contains NONEs after applying it. Could you advise me better direction or what should I change, please?

def DateFormat(data):
    for fmt in ('%b %d, %Y', '%d/%m/%Y', '%b %Y', '%Y/%b/%d'):
        try:
            pd.to_datetime(data['date'], format=fmt)
            
        except ValueError:
            pass


data['date'] =  data.apply(DateFormat, axis = 1)

BEFORE APPLY DateFormat: | ID | Date | | --- | -------------- | | 1 | 01/02/2023 | | 2 | Apr 02, 2016 | | 3 | Jun 2021 | | 4 | 2023/12/01 |

AFTER APPLY DateFormat: | ID | Date | | --- | ----- | | 1 | None | | 2 | None | | 3 | None | | 4 | None |

CodePudding user response:

If you don't use return to return value then it uses return None at the end of function.

You should use return pd.to_datetime(...)

And if you want to return original value when it can't convert it then you need return at the end. OR you could use return to return some default value.

def DateFormat(data):
    for fmt in ('%b %d, %Y', '%d/%m/%Y', '%b %Y', '%Y/%b/%d'):
        try:
             
            return pd.to_datetime(data['date'], format=fmt)
        except ValueError:
            pass

    # return original `date` if it couldn't convert
    return data['date']

    # or return some default value
    #return datetime.datetime(1900, 1, 1)

CodePudding user response:

Why don't let Pandas infer your datetime format? and force dayfirst=True for your first date format (%d/%m/%Y)

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
print(df)

# Output
   ID       Date
0   1 2023-02-01
1   2 2016-04-02
2   3 2021-06-01
3   4 2023-12-01
  • Related