Pandas update a date column

Time:10-20

I want to update the dates in a date column based on a condition, but I can't get it working:

import numpy as np
import pandas as pd

d_ = {
    'id': [1, 2, 3],
    'dt1': ['28-10-2016', '18-11-2016', '21-12-2016'],
}
df = pd.DataFrame(d_)
df['dt1'] = pd.to_datetime(df['dt1'], dayfirst=True)  # strings are day-month-year
df

id  dt1
1   2016-10-28
2   2016-11-18
3   2016-12-21

df['dt1'] = np.where(df['id']==1, pd.to_datetime('2018-01-01'), df['dt1'])
df

id  dt1
1   2018-01-01 00:00:00
2   1479427200000000000
3   1482278400000000000

df.dtypes
id      int64
dt1    object
dtype: object

The other values in the column turn into integers (nanosecond epoch timestamps), and the dtype of the column changes to 'object'.

CodePudding user response:

Use loc to assign only the matching rows, which avoids the conversion to a NumPy array:

df.loc[df['id']==1, 'dt1'] = pd.to_datetime('2018-01-01')

Or mask:

df['dt1'] = df['dt1'].mask(df['id']==1, pd.to_datetime('2018-01-01'))

output:

   id        dt1
0   1 2018-01-01
1   2 2016-11-18
2   3 2016-12-21
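
To check both suggestions end to end, here is a minimal self-contained sketch. The helper make_df and the names df_loc and df_mask are only for illustration and are not part of the question; the explicit format string is an assumption about how the dates are written.

```python
import pandas as pd

def make_df():
    # Hypothetical helper: rebuilds the frame from the question.
    df = pd.DataFrame({
        'id': [1, 2, 3],
        'dt1': ['28-10-2016', '18-11-2016', '21-12-2016'],
    })
    # Assumed format: day-month-year, as in the sample data.
    df['dt1'] = pd.to_datetime(df['dt1'], format='%d-%m-%Y')
    return df

replacement = pd.Timestamp('2018-01-01')

# Option 1: assign only to the matching rows with loc.
df_loc = make_df()
df_loc.loc[df_loc['id'] == 1, 'dt1'] = replacement

# Option 2: mask replaces values where the condition is True.
df_mask = make_df()
df_mask['dt1'] = df_mask['dt1'].mask(df_mask['id'] == 1, replacement)

print(df_loc.dtypes)
print(df_mask)
```

In both cases the column should keep a datetime dtype, since the data never leaves pandas.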

CodePudding user response:

You can use df.apply() to prevent the conversion, although it evaluates the lambda row by row and can be slow on large frames:

df['dt1'] = df.apply(lambda row: pd.to_datetime('2018-01-01') if row['id']==1 else row['dt1'], axis=1)

output:

   id        dt1
0   1 2018-01-01
1   2 2016-11-18
2   3 2016-12-21
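
If you would rather keep np.where, one possible variation (not from the answers above, so treat it as a sketch) is to pass the replacement as a NumPy datetime64 scalar instead of a pandas Timestamp. The assumption is that both branches then share a datetime dtype, so NumPy has no reason to fall back to object. The name new_value and the explicit format string are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'dt1': ['28-10-2016', '18-11-2016', '21-12-2016'],
})
# Assumed format: day-month-year, as in the sample data.
df['dt1'] = pd.to_datetime(df['dt1'], format='%d-%m-%Y')

# A NumPy datetime64 scalar, rather than a pd.Timestamp, so that both
# branches passed to np.where are datetime64 values.
new_value = np.datetime64('2018-01-01')
df['dt1'] = np.where(df['id'] == 1, new_value, df['dt1'])

print(df.dtypes)
print(df)
```

The loc and mask approaches stay inside pandas and are probably the safer choice; this variant is mainly useful when the rest of the code already relies on np.where.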