Pandas Converts My date string value in date column to an integer-CodePudding

I have an excel file which I am using pd.read_excel() to read, inside the excel file are couple of date columns (the date data type is a string and must follow this format: dd/mm/yyyy. The problem I have is that when the excel file gets converted to a dataframe using pd.read_excel(), the values gets converted into an integer.

Does anyone know how I can maintain the value in the excel file after it has been converted to a dataframe.

Screenshot below: The columns with the date format

What the values get converted to after converting the file to a dataframe

"43800" is what the value of "Incorporation Date" got converted to.

What I have tried:

for column in columns_with_date_string:
    client_entity_df[column] = pd.to_datetime(
        client_entity_df[column].astype(int)
    )
    client_entity_df[column] = client_entity_df[column].dt.strftime('%d/%m/%Y')

This approach returned the values as "01/01/1970", instead of the dates specified

TLDR:

I basically want to maintain the value of my date columns (12/11/2022) in my excel file where the format is "dd/mm/yyy" when the excel file gets converted to a dataframe, pandas currently changes the values to an integer (which I assume is an epoch) when it converts the file to an integer.

CodePudding user response：

You can use:

df_excel = pd.read_excel(file, dtype=object)

All columns became a object type and after that you can convert in another type if you need:

pd.to_datetime(df_excel['column_name'])

CodePudding user response：

So I got a solution to it:

for column in columns_with_date_string:
    df[column] = df[column].apply(
        lambda x: datetime.fromordinal(datetime(1900, 1, 1).toordinal()   int(x) - 2)
    )
    df[column] = df[column].dt.strftime('%m/%d/%Y')

This SO answer was helpful.