I want to update the dates in a datetime column based on a condition, but I can't get it to work:
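import numpy as np
import pandas as pd

d_ = {
    'id': [1, 2, 3],
    'dt1': ['28-10-2016', '18-11-2016', '21-12-2016'],
}
df = pd.DataFrame(d_)
# The strings are day-first (DD-MM-YYYY), so say so explicitly
df['dt1'] = pd.to_datetime(df['dt1'], dayfirst=True)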
df
id dt1
1 2016-10-28
2 2016-11-18
3 2016-12-21
df['dt1'] = np.where(df['id']==1, pd.to_datetime('2018-01-01'), df['dt1'])
df
id dt1
1 2018-01-01 00:00:00
2 1479427200000000000
3 1482278400000000000
df.dtypes
id int64
dt1 object
dtype: object
The rows that should stay unchanged are converted to integers, and the dtype of the column changes to object.
CodePudding user response:
Use loc to avoid the conversion to a NumPy array:
df.loc[df['id']==1, 'dt1'] = pd.to_datetime('2018-01-01')
Or mask:
df['dt1'] = df['dt1'].mask(df['id']==1, pd.to_datetime('2018-01-01'))
output:
id dt1
0 1 2018-01-01
1 2 2016-11-18
2 3 2016-12-21
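To check both variants side by side, here is a minimal self-contained sketch. The names `new_date`, `via_loc` and `via_mask` are only illustrative, and the explicit `format="%d-%m-%Y"` is an assumption about the input strings based on the sample data in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3],
    "dt1": ["28-10-2016", "18-11-2016", "21-12-2016"],
})
# Assumption: the strings are day-month-year, as in the question's sample
df["dt1"] = pd.to_datetime(df["dt1"], format="%d-%m-%Y")

new_date = pd.Timestamp("2018-01-01")  # illustrative replacement value

# Variant 1: boolean indexing with loc, assigning in place on a copy
via_loc = df.copy()
via_loc.loc[via_loc["id"] == 1, "dt1"] = new_date

# Variant 2: Series.mask replaces values where the condition is True
via_mask = df.copy()
via_mask["dt1"] = via_mask["dt1"].mask(via_mask["id"] == 1, new_date)

# Both columns should still report a datetime64 dtype, not object
print(via_loc.dtypes)
print(via_mask.dtypes)
```

Both variants stay inside pandas, so the column never passes through a mixed-type NumPy array, which is where the integers in the question come from.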
CodePudding user response:
You can use df.apply() to prevent the conversion:
df['dt1'] = df.apply(lambda row: pd.to_datetime('2018-01-01') if row['id']==1 else row['dt1'], axis=1)
>> id dt1
>> 0 1 2018-01-01
>> 1 2 2016-11-18
>> 2 3 2016-12-21
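Note that apply with axis=1 calls the lambda once per row, which can be slow on large frames. If you would rather keep the original np.where call, one possible vectorized variant (not taken from the answers above, so treat it as a sketch) is to pass a np.datetime64 scalar instead of a pandas Timestamp, so that both branches are datetime64 and NumPy does not need to fall back to object. The explicit `format="%d-%m-%Y"` is again an assumption based on the sample data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3],
    "dt1": ["28-10-2016", "18-11-2016", "21-12-2016"],
})
# Assumption: the strings are day-month-year, as in the question's sample
df["dt1"] = pd.to_datetime(df["dt1"], format="%d-%m-%Y")

# Both value arguments are datetime64, so the result is a datetime64 array
df["dt1"] = np.where(
    (df["id"] == 1).to_numpy(),
    np.datetime64("2018-01-01"),
    df["dt1"].to_numpy(),
)

print(df.dtypes)
```

This only applies to timezone-naive columns; for timezone-aware data, loc or mask is the safer choice.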