Home > Software design >  Subtracking into from Dates
Subtracking into from Dates

Time:12-13

I have some data that will looks like this:

       Dates  Delta
0 2022-10-01     10
1 2022-10-01     21
2 2022-10-01     34

I am trying to add a new column, where I can subtract the number in the Delta column from the date in the Dates column. Ideally, the output will look like this (i did this by hand so if the dates are wrong, please excuse me).

        Dates  Delta CalculatedDate
0  2022-10-01     10     2022-09-21
1  2022-10-01     21     2022-09-10
2  2022-10-01     34     2022-08-23

I've tried various versions of this and I'm not having any luck.

# importing libraries to create and manipulate toy data
import pandas as pd
from datetime import datetime, timedelta

# create toy data
df = pd.DataFrame({'Dates': ['2022-10-01', '2022-10-01', '2022-10-01'],
                  'Delta': [10, 21, 34]})

# cast the `Dates` column as dates
df['Dates'] = pd.to_datetime(df['Dates'])



##### Need help here
# Create a new column, showing the calculated date
df['CalculatedDate'] = df['Dates']  - timedelta(days=df['Delta'])

CodePudding user response:

df['CalculatedDate'] = pd.to_datetime(df['Dates']) - pd.TimedeltaIndex(df['Delta'], unit='D')

df

    Dates       Delta   CalculatedDate
0   2022-10-01  10      2022-09-21
1   2022-10-01  21      2022-09-10
2   2022-10-01  34      2022-08-28

CodePudding user response:

Here is one way to do it

# for each row subtract the delta from the date in the row
# using Day offset

df['calculatedDate']= df.apply(lambda x: x['Dates'] - pd.offsets.Day(x['Delta']), axis=1)
df
Dates   Delta   calculatedDate
0   2022-10-01  10  2022-09-21
1   2022-10-01  21  2022-09-10
2   2022-10-01  34  2022-08-28

CodePudding user response:

I see Naveed and Panda has a fix that works great, suggesting the one I came up with as well:

for x in range(len(df)):
    df.loc[x,'CalculatedDate'] = df.loc[x, 'Dates']  - timedelta(days=int(df.loc[x, 'Delta']))
print(df)

Put it in a for loop so that you can index through each row and do each row individually. Also make df['CalculatedDate'] into df.loc[x,'CalculatedDate']. This way you will do each row individually. Hope this helps

  • Related