Problem subtracting datetimes python pandas

Time:11-15

I have the following code snippet:

    ls3['REP'] = pd.to_datetime(ls3['REP']).dt.to_period('M')
    ls3['month'] = pd.to_datetime(ls3['month']).dt.to_period('M')
    ls3['MonthsBetween'] = ls3['REP']-ls3['month']

So REP is a column of values like 2022-05, and month is also in the format YYYY-MM. I want a column that gives me the difference as an integer number of months.

So if my REP column is 2022-05 and my month column is 2022-01, then the difference should be 4. Instead, the method above gives me an output like <month-end> in every row.

I have also tried

    ls3["MonthsBetween"] = (ls3['REP'] - ls3["month"]) / np.timedelta64(1, "M")

So instead of turning full dates such as 2022-05-01 into 2022-05, I subtract, say, 2022-05-01 - 2022-01-20, convert the result to months, and then round it. The rounding causes issues because I don't care how many days lie between the two dates; I only want to subtract the two month values.

Answer:

IIUC, you don't have to overwrite your existing columns; perform the computation and assign the result directly:

    ls3['MonthsBetween'] = (pd.to_datetime(ls3['REP']).dt.to_period('M')
                            - pd.to_datetime(ls3['month']).dt.to_period('M')
                            )
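Note that subtracting two monthly periods yields pandas offset objects (displayed as something like <4 * MonthEnds>) rather than plain numbers. The following is only a minimal sketch, assuming a pandas version in which Period subtraction returns an offset with an `n` attribute; the variable names `rep`, `month`, `diff`, and `offsets` are illustrative and not from the original post:

```python
import pandas as pd

# Two monthly periods, mirroring the example values from the question.
rep = pd.Period("2022-05", freq="M")
month = pd.Period("2022-01", freq="M")

# The difference is an offset object, not an int.
diff = rep - month

# The number of months is exposed through the offset's .n attribute.
months_between = diff.n

# The same idea applied to whole columns: each element is an offset,
# so extract .n element by element.
ls3 = pd.DataFrame({"REP": ["2022-05-01"], "month": ["2022-01"]})
offsets = (pd.to_datetime(ls3["REP"]).dt.to_period("M")
           - pd.to_datetime(ls3["month"]).dt.to_period("M"))
ls3["MonthsBetween"] = offsets.apply(lambda off: off.n)
```

Extracting `.n` row by row is slower than a vectorized conversion, so it is mostly useful for understanding what the subtraction returns.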

If you want integers:

    # astype(int) turns each monthly period into its integer ordinal (months since 1970-01)
    ls3['MonthsBetween'] = (pd.to_datetime(ls3['REP']).dt.to_period('M').astype(int)
                            - pd.to_datetime(ls3['month']).dt.to_period('M').astype(int)
                            )

Output:

          REP    month  MonthsBetween
0  2022-05-01  2022-01              4

Used input:

    ls3 = pd.DataFrame({'REP': ['2022-05-01'], 'month': ['2022-01']})
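As an alternative that avoids periods altogether, the month difference can be computed directly from the year and month components, which ignores the day of the month entirely. This is a hedged sketch rather than part of the original answer; the second row of the sample data and the names `rep` and `month` are assumptions added for illustration:

```python
import pandas as pd

# Sample input; the second row is hypothetical and crosses a year boundary.
ls3 = pd.DataFrame({
    "REP": ["2022-05-01", "2023-02-15"],
    "month": ["2022-01", "2022-11"],
})

rep = pd.to_datetime(ls3["REP"])
month = pd.to_datetime(ls3["month"])

# Whole-month difference: 12 months per year of difference,
# plus the difference between the calendar months.
ls3["MonthsBetween"] = ((rep.dt.year - month.dt.year) * 12
                        + (rep.dt.month - month.dt.month))

print(ls3)
```

Because only the year and month fields are used, 2022-05-01 and 2022-05-31 give the same result, which matches the requirement of not caring about the number of days.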