I have a variable "date_0", which corresponds to the date of infection in a certain period, and I have to create another variable "date_1" from it using a date condition (>= 33 days). Both "date_0" and "date_1" are in the format %Y-%m-%d %H:%M:%S. So basically I need a condition like this:
if df['date_1'] >= 33 days then df['date_1'] else df['date_0']
I tried this:
df['date_1']= np.where((df['date_1']>= 33d), df['date_1'],df['date_0'])
I also tried this:
df['date_1']= np.where((df['date_1'].days>= 33), df['date_1'],df['date_0'])
But both attempts failed. Can you help me, please?
CodePudding user response:
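You can use Series.where, the pandas counterpart of np.where. Note that it keeps the original values where the condition is True and takes the second argument everywhere else. Example: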
import pandas as pd
df = pd.DataFrame({'date_0': [pd.Timestamp('2022-01-01'), pd.Timestamp('2022-02-01')],
                   'date_1': [pd.Timestamp('2022-01-29'), pd.Timestamp('2022-03-15')]})
df['days_diff'] = (df['date_1']-df['date_0']).dt.days
# date_select:
# use date_1 if the difference is greater than or equal to 33 days, else use date_0
df['date_select'] = df['date_1'].where(df['days_diff'] >= 33, df['date_0'])
df
      date_0     date_1  days_diff date_select
0 2022-01-01 2022-01-29         28  2022-01-01
1 2022-02-01 2022-03-15         42  2022-03-15
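If you would rather stay with np.where as in the question, a minimal sketch follows. It assumes the columns arrive as strings in the %Y-%m-%d %H:%M:%S format and that the 33-day condition applies to the gap between date_1 and date_0. The sample values and the names elapsed, keep_date_1 and date_select are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, stored as strings in the format from the question.
df = pd.DataFrame({
    "date_0": ["2022-01-01 08:30:00", "2022-02-01 12:00:00"],
    "date_1": ["2022-01-29 09:00:00", "2022-03-15 18:45:00"],
})

# Parse the strings into real datetimes so that subtraction works.
fmt = "%Y-%m-%d %H:%M:%S"
df["date_0"] = pd.to_datetime(df["date_0"], format=fmt)
df["date_1"] = pd.to_datetime(df["date_1"], format=fmt)

# Subtracting two datetime columns yields timedeltas, which can be
# compared against a pd.Timedelta instead of a bare number like 33d.
elapsed = df["date_1"] - df["date_0"]
keep_date_1 = elapsed >= pd.Timedelta(days=33)

# np.where(condition, value_if_true, value_if_false)
df["date_select"] = np.where(keep_date_1, df["date_1"], df["date_0"])
print(df)
```

The original attempts failed because 33d is not valid Python syntax and because a datetime column has no .days attribute; only a difference of two dates (a timedelta) can be measured in days.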