Change number to datetime for whole column in dataframe (ddf) in Pandas

Time:02-02

I have an Excel .xlsb sheet with data. Some columns contain numbers, while other columns should contain dates. After loading the data in Python, some of the date columns show a number instead of a date. How can I convert the numbers in such a column to dates? I use Pandas and my dataframe is called ddf.

For example, the date-of-birth column ('dob_l1') shows '12150', which should be the date '6-4-1933' (6 April 1933).

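I tried to solve this, but I only managed to get the date '2050-01-12', which is incorrect. I used the code 'ddf['nwdob_l1'] = pd.to_datetime(ddf['dob_l1'], format='%d%m%y', errors='coerce')'. With that format, the digits of '12150' are read as day 12, month 1, year 50 rather than as a count of days.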

Who can help me? I was happy to receive some good feedback from joe90, who showed me a function that converts a single date:

import datetime

def xldate2date(xl):
    # valid for dates from 1900-03-01
    basedate = datetime.date(1899, 12, 30)
    d = basedate + datetime.timedelta(days=xl)
    return d

# Example:
# >>> print(xldate2date(44948))
# 2023-01-22

That is correct. However, I need to convert every value in the column (more than 500,000 rows), so I cannot do it one by one. As that question is closed, I am opening a new one. Can anyone help me find the code to get the right date for the whole column?

CodePudding user response:

When you read the data in with pandas, there are tools for handling dates. You want to use the parse_dates argument of read_excel.

Documentation for read_excel

example:

import pandas as pd

df = pd.read_excel('file/path/the.xlsx', parse_dates=['Date'])

This asks pandas to read the column as datetime64 instead of plain numbers, which is much easier to work with.
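
If the column has already been loaded as numbers, it may be simpler to convert it in one vectorized step instead of re-reading the file. The sketch below assumes the values are Excel serial day counts with the same 1899-12-30 base date as the xldate2date function in the question, and it uses a small made-up sample in place of the real 'ddf'. It relies on the unit and origin arguments of pd.to_datetime.

```python
import pandas as pd

# Hypothetical sample standing in for the real dataframe from the question.
ddf = pd.DataFrame({"dob_l1": [12150, 44948, None]})

# Make sure the values are numeric; anything unparseable becomes NaN.
serials = pd.to_numeric(ddf["dob_l1"], errors="coerce")

# Treat each number as a count of days since 1899-12-30
# (assumption: Excel 1900 date system, valid for dates from 1900-03-01).
ddf["nwdob_l1"] = pd.to_datetime(serials, unit="D", origin="1899-12-30")

print(ddf)
```

With this sample, 12150 becomes 1933-04-06 and the missing value becomes NaT. The conversion runs on the whole column at once, so it should be fine for 500,000+ rows.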
