Home > Blockchain >  Specify header name dtype when reading csv pandas
Specify header name dtype when reading csv pandas

Time:11-01

I have a dataframe where all the column names are dates. When I use read_csv, pandas reads these column names as strings. Is there a way to specify that I want the column names to be datetime objects. I need this to be done as part of the read_csv call, rather than an additional line afterwards.

For example, the csv file looks something like

df = pd.DataFrame({'2022-10-25': [0.2, 1.5, 1.4], '2022-10-26': [1.1, 2.3, 0.2]})

and when I call read_csv, I want the column headers to be datetime objects, rather than strings.

CodePudding user response:

either you define the whole column as a datetime or as an integer... there is no way you can define the header as a datetime and the values as a different thing.

On the other hand you cannot define your columns as datetime, because the values are just integers... I think you are approaching your problem in a wrong way.

CodePudding user response:

Create the dummy dataset:

df = pd.DataFrame({'2022-10-25': [0, 1, 1], '2022-10-26': [1, 1, 0]})

Check the initial columns index:

df.columns

Output:

enter image description here

You can run the following code just after the .read_csv()

from datetime import datetime
datetime_columns = [datetime.strptime(col, '%Y-%m-%d') for col in df.columns]
df.columns = pd.Index(datetime_columns, dtype='datetime64[ns]')

Finally we check the new index:

df.columns

Output:

enter image description here

  • Related