Converting Dataframe column to datetime doesn't complete

Time:11-17

I am trying to convert a column of a large dataset (660k rows) to datetime type in a Jupyter notebook. I have found two ways to do it:

pd.to_datetime(df['local_time'],format='%d/%m/%Y') 
df['local_time'].astype("datetime64[ns]")

but neither of them completes, even after a couple of hours. Is there a way to make it faster? None of the laptop's resources appear to be fully used. My laptop is an Acer S7 with an Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz and 8 GB of RAM.

CodePudding user response:

Converting a 660k-row column in pandas on an i7 with 8 GB of RAM should take no more than a few seconds.

Both methods are acceptable. Could you please post a sample of the column's values?
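
One way to check the expectation above is to time the conversion on synthetic data of the same size. This is only a sketch: the generated strings are an assumption standing in for the real `local_time` column, which has not been shown, and the actual values may differ from the `%d/%m/%Y` pattern.

```python
import time

import pandas as pd

# Synthetic stand-in for the asker's column (assumption): 660k strings
# in day/month/year form, generated from one timestamp per minute.
stamps = pd.date_range("2000-01-01", periods=660_000, freq="min")
df = pd.DataFrame({"local_time": stamps.strftime("%d/%m/%Y")})

# Time the conversion with an explicit format, as in the question.
start = time.perf_counter()
converted = pd.to_datetime(df["local_time"], format="%d/%m/%Y")
elapsed = time.perf_counter() - start

print(converted.dtype)
print(f"conversion took {elapsed:.2f} s")

# A few raw values, which is the kind of sample worth posting.
print(df["local_time"].head())
```

If the synthetic column converts quickly but the real one does not, the difference most likely lies in the real data, which is why a sample of the actual values helps.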

CodePudding user response:

I am not sure what the reason was, but I was converting multiple columns at once, and that made the conversion many times slower.

df[['date_1', 'date_2', 'date_3', 'date_4']] = df[['date_1', 'date_2', 'date_3', 'date_4']].astype('datetime64[ns]')

After converting each column in a separate step, the time became reasonable:

df['date_1'] = df['date_1'].astype('datetime64[ns]')
df['date_2'] = df['date_2'].astype('datetime64[ns]')
df['date_3'] = df['date_3'].astype('datetime64[ns]')
df['date_4'] = df['date_4'].astype('datetime64[ns]')
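
The per-column steps above can also be written as a loop. This is a minimal sketch, not the method used in the thread: it swaps in `pd.to_datetime` with an explicit format, and the sample values and their `%Y-%m-%d` layout are assumptions, since the real contents of these columns were not posted.

```python
import pandas as pd

# Hypothetical sample data; the real columns were not shown in the thread.
df = pd.DataFrame({
    "date_1": ["2021-11-17", "2021-11-18"],
    "date_2": ["2021-01-01", "2021-01-02"],
    "date_3": ["2020-06-30", "2020-07-01"],
    "date_4": ["2019-12-31", "2020-01-01"],
})

date_cols = ["date_1", "date_2", "date_3", "date_4"]

# Convert one column at a time rather than assigning a whole
# sub-DataFrame at once. The format string is an assumption about the data.
for col in date_cols:
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d")

print(df.dtypes)
```

The format string should be changed to match the real data; if it does not match, `pd.to_datetime` raises an error by default instead of guessing.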