Pandas int type to date type

Time:12-17

I am new to pandas and am trying to convert an int-type column to a date-type column.

The int in the df is something like: 10712 (first day, then month, then year).

I tried solving this with:

df_date = pd.to_datetime(df['Date'], format='%d%m%Y')

but I always get the following value error:

time data '10712' does not match format '%d%m%Y' (match)

Thank you for your help :)

CodePudding user response:

You should use %y (2-digit year) instead of %Y (4-digit year). But that is not enough.

The format %d%m%y parses 10712 as 10-07-2012 (day 10, month 7), not as 01-07-2012 as you expect.

That's because of the following feature of the underlying strptime:

When used with the strptime() method, the leading zero is optional for %m
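This is easy to reproduce with the standard library alone. A minimal sketch, assuming plain `datetime.strptime` behaves like the parser described above (the variable names are only illustrative): without a leading zero, `%d` takes the first two digits and `%m` settles for a single digit, while a zero-padded string parses the way the question intends.

```python
from datetime import datetime

# '10712' has no leading zero, so %d consumes "10" and %m accepts the lone "7".
ambiguous = datetime.strptime('10712', '%d%m%y')

# With an explicit leading zero, %d consumes "01" and %m consumes "07".
padded = datetime.strptime('010712', '%d%m%y')

print(ambiguous.date())
print(padded.date())
```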

A workaround could be to convert to a format properly understandable by strptime (and to_datetime):

>>> df = pd.DataFrame({'date': [10712, 20813, 30914]})
>>> df 
    date
0  10712
1  20813
2  30914

>>> df1 = df.date.astype(str).str.replace(r'(\d+)(\d\d)(\d\d)',
                                          r'\2/\1/\3', regex=True)
>>> df1
0    07/1/12
1    08/2/13
2    09/3/14

>>> pd.to_datetime(df1)
0   2012-07-01
1   2013-08-02
2   2014-09-03
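Another option, not part of the answer above, is to restore the missing leading zero and keep the original day-month-year format. This is a sketch under the assumption that every value is a day-month-year integer of five or six digits, as in the question; `padded` and `parsed` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'date': [10712, 20813, 30914]})

# Left-pad each value to six characters so a single-digit day gets its zero back.
padded = df['date'].astype(str).str.zfill(6)

# Every field now has exactly two digits, so %d%m%y is unambiguous.
parsed = pd.to_datetime(padded, format='%d%m%y')
print(parsed)
```

Values with a two-digit day, such as 150712, already have six digits and pass through `zfill` unchanged.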

CodePudding user response:

Use the %y year specifier to parse a year without century digits:

In [654]: pd.to_datetime(10712, format='%d%m%y')
Out[654]: Timestamp('2012-07-10 00:00:00')

CodePudding user response:

pandas.to_datetime only works with '%Y%m%d' here, which is why you cannot use %d%m%Y.

Example:

>>> pd.to_datetime('13000101', format='%Y%m%d', errors='ignore')
datetime.datetime(1300, 1, 1, 0, 0)
>>> pd.to_datetime('13000101', format='%d%m%Y', errors='ignore')
'13000101'
>>> pd.to_datetime('13000101', format='%m%d%Y', errors='ignore')
'13000101'