Pandas to_datetime Slowing Script

Time:11-12

I have a script that reads a CSV file, and it seems to have slowed down recently (I am sure the same code used to run faster). I have narrowed the issue down to this line:

data['datetime'] = pd.to_datetime(data['datetime'])

The CSV is quite basic:

2021-11-03 09:30:00-04:00,150.39,150.8,150.3,150.47,9583

Yet parsing just 2000 rows takes ~0.2 seconds, which seems much slower than I would expect.
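A minimal sketch for reproducing the measurement, assuming synthetic timestamps in the same format as the row above. The 2000 one-minute rows and the fixed -04:00 offset are made up for illustration, and the timing will vary by machine and pandas version:

```python
import time

import pandas as pd

# Hypothetical data: 2000 one-minute timestamps with a fixed -04:00 offset,
# formatted like the sample row above.
stamps = pd.date_range('2021-11-03 09:30:00', periods=2000, freq='min')
data = pd.DataFrame(
    {'datetime': stamps.strftime('%Y-%m-%d %H:%M:%S') + '-04:00'}
)

# Time only the conversion, not the construction of the test data.
start = time.perf_counter()
parsed = pd.to_datetime(data['datetime'])
elapsed = time.perf_counter() - start

print(f'{len(data)} rows parsed in {elapsed:.4f} s')
```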

I have tried updating Python and pandas in case that was the cause, but the issue is still there.

Is this amount of time normal and is there anything else I can check or do to improve the speed?

EDIT2 - I recreated the CSV and thought this had cured it. Unfortunately it has not, and this line still takes ~0.2 s to run.

CodePudding user response:

Try this:

df = pd.read_csv(file, parse_dates=['datetime'])

EDIT

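If that doesn't parse your date format, pass an explicit parser. The format string needs %z because the timestamps carry a UTC offset (-04:00). Note that date_parser is deprecated as of pandas 2.0 in favour of date_format:

from datetime import datetime

dateparse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S%z')

df = pd.read_csv(file, parse_dates=['datetime'], date_parser=dateparse)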
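Another option is to give pd.to_datetime an explicit format so it does not have to infer one, and to pass utc=True so the result is a single UTC-based column. This is only a sketch, assuming every row carries a numeric UTC offset as in the sample. The column names, the inline CSV and its second row are hypothetical, and whether it is faster on your data is something to measure:

```python
import io

import pandas as pd

# Hypothetical header-less file; the second row is made up for illustration
# and the column names are assumed, not taken from the question.
csv_text = (
    '2021-11-03 09:30:00-04:00,150.39,150.8,150.3,150.47,9583\n'
    '2021-11-03 09:31:00-04:00,150.47,150.6,150.4,150.52,7210\n'
)
cols = ['datetime', 'open', 'high', 'low', 'close', 'volume']
data = pd.read_csv(io.StringIO(csv_text), names=cols)

# An explicit format skips format inference; utc=True converts the
# offset-aware strings into one timezone-aware UTC column.
data['datetime'] = pd.to_datetime(
    data['datetime'], format='%Y-%m-%d %H:%M:%S%z', utc=True
)

print(data['datetime'].dt.tz)
```

The values are then stored in UTC, so 09:30 at -04:00 becomes 13:30 UTC. Convert back with .dt.tz_convert if local times are needed.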