Pandas error when reading date from excel file. I am creating a dataframe using the following command.
df = pd.read_excel("report_file.xls", parse_dates=['operation_date'])
df.dtypes
operation_date datetime64[ns]
Everything looks good. But when analyzing the dataframe, an error was found. After the number of the day matches the number of the month, the pandas is mistaken and reverses the day and month. For example, in October data it looks like this.
45 2021-10-13 11:50:34 ... 329.97
46 2021-10-13 11:41:56 ... 323.50
47 2021-10-13 11:41:55 ... 2600.00
48 2021-10-10 02:05:13 ... 1479.45
49 2021-09-10 20:22:01 ... 40.00
50 2021-09-10 19:39:39 ... 42.64
51 2021-09-10 19:39:39 ... 350.00
52 2021-06-10 20:11:48 ... 20.00
53 2021-06-10 13:34:25 ... 1.96
You can see that after 2021-10-10 day number at the place of month.
CodePudding user response:
Try passing the date format explicitly, something like this:
pd.read_excel(
"report_file.xls",
parse_dates=['operation_date'],
date_parser=lambda x: pd.to_datetime(x, format='%Y-%m-%d %I:%M:%S')
)
CodePudding user response:
I used another way.
df = pd.read_excel(report_file)
df['operation_date'] = pd.to_datetime(df['operation_date'],dayfirst=True)
In this case operation_date datetime64[ns]
45 2021-10-13 11:50:34 ... 329.97
46 2021-10-13 11:41:56 ... 323.50
47 2021-10-13 11:41:55 ... 2600.00
48 2021-10-10 02:05:13 ... 1479.45
49 2021-10-09 20:22:01 ... 40.00
50 2021-10-09 19:39:39 ... 42.64
51 2021-10-09 19:39:39 ... 350.00
52 2021-10-06 20:11:48 ... 20.00
53 2021-10-06 13:34:25 ... 1.96
And date look correctly.