Home > Mobile >  Pandas convert date format to an integer using a reference date
Pandas convert date format to an integer using a reference date

Time:03-29

I am trying to convert columns of data in date format to an integer using a referenced date, 01/10/2021 (0). For example, 01/07/2021 would equal 92 and 10/10/2021 equals -9.

It can be done in excel but pandas generate a TypeError when trying to subtract the dates from the reference date.

TypeError: unsupported operand type(s) for -: 'str' and 'str'

CodePudding user response:

You need to convert date column from str to datetime first with pandas.to_datetime.

Here is an example:

import pandas as pd
d = pd.DataFrame({'date': ['01/10/2021', '01/07/2021', '10/10/2021']})
d['date'] = pd.to_datetime(d['date'], dayfirst=True)
d['date'][0] - d['date']

Output:

0    0 days
1   92 days
2   -9 days

CodePudding user response:

IIUC, convert to_datetime and perform a subtraction with the reference timestamp. Then extract the days with dt.days:

df = pd.DataFrame({'date': ['01/07/2021', '01/10/2021', '10/10/2021']})

df['delta'] = (pd.to_datetime(df['date'], dayfirst=True)
                 .rsub(pd.to_datetime('01/10/2021', dayfirst=True))
                 .dt.days
              )

Output:

         date  delta
0  01/07/2021     92
1  01/10/2021      0
2  10/10/2021     -9
  • Related