Home > Mobile >  Pandas changes date format while reading csv file altough format in the file does not change
Pandas changes date format while reading csv file altough format in the file does not change

Time:02-26

I have a problem that I don't understand with reading a csv file using pandas. Pandas suddenly changes the format of the date altough the format in the file is exactly the same. Here you can see the screenshot of the file, once opened with Excel, once with Windows-Editor, and once after Pandas has read it in Sypder.

enter image description here

You can see that the date format remains the same in Excel and the CSV file but when Pandas reads it, it suddently changes it by switching the order of months and days.

Here is the code I use:

import pandas as pd

pathOfTheInputData = "C:/Users/User1/Desktop/Test_Read_Date.csv"

df = pd.read_csv(pathOfTheInputData, sep=';', header=0, low_memory=False, infer_datetime_format=True, parse_dates={'datetime':[0]}, index_col=['datetime'])

And here you can have a look at the data: Test_Data

Any idea why this is happening? I also tried it by setting infer_datetime_format=False but this leads to the same result.

CodePudding user response:

Use dayfirst=True as parameter of read_csv:

df = pd.read_csv('Test_Read_Date.csv', sep=';',
                 parse_dates=['timestamp'], dayfirst=True)

Output

>>> df

               timestamp  temperatures
0    2021-06-07 22:00:00         17.00
1    2021-06-07 22:15:00         16.88
2    2021-06-07 22:30:00         16.75
3    2021-06-07 22:45:00         16.63
4    2021-06-07 23:00:00         16.50
...                  ...           ...
9699 2021-09-16 22:45:00         13.25
9700 2021-09-16 23:00:00         13.40
9701 2021-09-16 23:15:00         13.33
9702 2021-09-16 23:30:00         13.25
9703 2021-09-16 23:45:00         13.18

[9704 rows x 2 columns]

>>> df.loc[487:488]
              timestamp  temperatures
487 2021-06-12 23:45:00         18.38
488 2021-06-13 00:00:00         18.30
  • Related