Why the difference in Date format when converting to Datetime in Pandas? [duplicate]

Time:10-04

I have a dataframe like the following.

import pandas as pd

data = [['250635', 'Comcast Cable Internet Speeds', '22-04-15', '22-Apr-15', '3:53:50 PM'],
        ['223441', 'Payment disappear - service got disconnected', '04-08-15', '04-Aug-15', '10:22:56 AM'],
        ['242732', 'Speed and Service', '18-04-15', '18-Apr-15', '9:55:47 AM'],
        ['277946', 'Comcast Imposed a New Usage Cap of 300GB that punishes streaming.', '05-07-15', '05-Jul-15', '11:59:35 AM']]

df = pd.DataFrame(data, columns = ['Ticket #', 'Customer Complaint', 'Date', 'Date_month_year', 'Time'])

The dataframe has a Date column which is in the format of dd-mm-yy.

I converted the object column to datetime using:

df['Date'] = pd.to_datetime(df['Date'])

However, the result has the month and day swapped in some rows, seemingly at random:

Date Date_month_year
2015-04-22 22-Apr-15
2015-04-08 04-Aug-15
2015-04-18 18-Apr-15
2015-05-07 05-Jul-15

For example, the 1st and 3rd entries are in the correct YYYY-mm-dd order, but the 2nd and 4th have their months and days wrong and are in the order YYYY-dd-mm.

Please help.

Thanks in advance.

CodePudding user response:

to_datetime accepts a format string. In your case you can use it as:

pd.to_datetime(df['Date'], format='%d-%m-%y')
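As a quick check of the suggestion above, here is a minimal sketch that applies the explicit format to the four Date values from the question. The names dates and parsed are only illustrative, and the Series is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# The four 'Date' values from the question, all in dd-mm-yy order.
dates = pd.Series(['22-04-15', '04-08-15', '18-04-15', '05-07-15'], name='Date')

# %d = day, %m = month, %y = two-digit year; every row is parsed the same way.
parsed = pd.to_datetime(dates, format='%d-%m-%y')

print(parsed.dt.strftime('%Y-%m-%d').tolist())
# ['2015-04-22', '2015-08-04', '2015-04-18', '2015-07-05']
```

With an explicit format, a value such as 04-08-15 can no longer be read as April 8, so the 2nd and 4th rows now agree with the Date_month_year column.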

CodePudding user response:

This may be a duplicate; see https://stackoverflow.com/a/50372326 for a simple answer by @Scott Boston that usually works for me. Adding the argument dayfirst=True will probably solve your problem.
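A minimal sketch of the dayfirst=True approach on the same four values follows. The names dates and parsed_dayfirst are illustrative only. Note that the pandas documentation describes dayfirst as a preference rather than a strict rule, and some pandas versions may emit a warning when they cannot infer a single format for the column.

```python
import pandas as pd

# The four 'Date' values from the question, all in dd-mm-yy order.
dates = pd.Series(['22-04-15', '04-08-15', '18-04-15', '05-07-15'], name='Date')

# dayfirst=True tells the parser to prefer day-month order for ambiguous
# strings such as '04-08-15' (4 August rather than April 8).
parsed_dayfirst = pd.to_datetime(dates, dayfirst=True)

print(parsed_dayfirst.dt.strftime('%Y-%m-%d').tolist())
```

When every value is known to follow one layout, passing an explicit format string is the stricter option, since it fails loudly on rows that do not match instead of guessing.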
