Home > Net >  Add timedelta to date column based on conditions
Add timedelta to date column based on conditions

Time:06-20

Here is my sample data from df.to_dict()

{'Date': {0: Timestamp('2022-01-01 00:00:00'),
  1: Timestamp('2022-01-02 00:00:00'),
  2: Timestamp('2022-01-03 00:00:00'),
  3: Timestamp('2022-01-04 00:00:00')},
 'tf': {0: 'D', 1: 'W', 2: 'M', 3: 'Y'},
 'tf_int': {0: 1, 1: 3, 2: 2, 3: 2}}

//Sorry this isn't a pretty table, I couldn't get the normal df to be pasted here

I want to make a new column based that will add the required timedelta to the Date column.

For eg, for my first row, I want to add 1 day to the date (2022-1-1) so the result would be 2022-1-2

Is there any vectorised way to achieve this ? I don't using multiple .loc[] calls to accomplish this separately for days,weeks, months etc; But I would prefer the solution to be vectorised

My unsuccessful attempt with .loc[]

df.loc[df['tf'] =='D','publish_date'] = df['Date'].dt.date   dt.timedelta(days = df['tf_int'])

But this results in error because df['tf_int'] is a series and not an exact value

Edit: For my use case, adding one month will add affect only the month component, and not do anything to the date

CodePudding user response:

If need exact year and months is necessary use list comprhension solution with offsets.DateOffset:

d = {'D':'days','W':'weeks', 'M':'months', 'Y':'years'}

df['publish_date'] = [z   pd.offsets.DateOffset(**{d[y]: x}) 
                     for x, y, z in zip(df['tf_int'], df['tf'], df['Date'].dt.normalize())]
print (df)
        Date tf  tf_int publish_date
0 2022-01-01  D       1   2022-01-02
1 2022-01-02  W       3   2022-01-23
2 2022-01-03  M       2   2022-03-03
3 2022-01-04  Y       2   2024-01-04

CodePudding user response:

If you accept to have approximations for Month/Year, a vectorial solution could be something like:

mapper = {'D': 1, 'W': 7, 'M': 30, 'Y': 365}
df['Date2'] = (pd.to_datetime(df['Date'])
                 .add(pd.to_timedelta(df['tf'].map(mapper).mul(df['tf_int']),
                                      unit='D'))
              )

output:

        Date tf  tf_int      Date2
0 2022-01-01  D       1 2022-01-02
1 2022-01-02  W       3 2022-01-23
2 2022-01-03  M       2 2022-03-04
3 2022-01-04  Y       2 2024-01-04
  • Related