Home > Software engineering >  Pandas dataframe to_datetime() gives error
Pandas dataframe to_datetime() gives error

Time:03-11

Goal: I read measurement data from a .csv and convert them to a dataframe. Then I add the date information from the filename to the time string which is already in the dataframe. And the last step is to convert this string with date and time informatin into a datetime object.

First steps that worked:

import pandas as pd
filename = '2022_02_14_data_0.csv
path = 'C:/Users/ma1075116/switchdrive/100_Schaltag/100_Digitales_Model/Messungen/'
measData = pd.read_csv(path filename, sep = '\t', header = [0,1], encoding = 'ISO-8859-1')
# add the date to the timestamp string
measData['Timestamp'] = filename[:11] measData['Timestamp']

An object in the Dataframe measData['Timestamp'] has now exacty a string with the following pattern:

'2022_02_14_00:00:06'

Now I want to convert this string to datetime:

measData['Timestamp'] = pd.to_datetime(measData['Timestamp'], format= '%Y_%m_%d_%H:%M:%S')

This raises the error:

ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing

Why do I get this error and how can I avoid it? I am pretty shure that the format is correct.

Edit: I wrote a sample code which should do exactly the same, and it works:

filename = '2022_02_14_data_0.csv'
timestamps = {'Timestamp': ['00:00:00', '00:00:01', '00:00:04']}
testFrame = pd.DataFrame(timestamps)
testFrame['Timestamp'] = testFrame['Timestamp']#
testFrame['Timestamp'] = filename[:11] testFrame['Timestamp']
testFrame['Timestamp'] = pd.to_datetime(testFrame['Timestamp'], format= '%Y_%m_%d_%X') 

My next step is now to check if all timestamp entries in the dataframe have the same format.

CodePudding user response:

You can do it like this using datetime.datetime.strptime and apply in the column.

Recreating your dataset:

    import datetime
    import pandas as pd

    data={'2016_03_29_08:15:27', '2017_03_29_08:18:27', 
    '2018_06_30_08:15:27','2019_07_29_08:15:27'}
     columns={'time'}

    df= pd.DataFrame(data=data, columns=columns)

Applying the desired transformation:

   df['time'] = df.apply(lambda row : datetime.datetime.strptime(row['time'], 
   '%Y_%m_%d_%H:%M:%S'), axis=1)

CodePudding user response:

Your format seems to be missing an underscore after day.

This works for me:

import pandas as pd
date_str = '2022_02_14_00:00:06'
pd.to_datetime(date_str, format= '%Y_%m_%d_%H:%M:%S')

EDIT:

This works fine for me (measData["Timestamp"] is a pd.Series):

import pandas as pd

measData = pd.DataFrame({"Timestamp": ['2022_02_14_00:00:06', '2022_02_14_13:55:06', '2022_02_14_12:00:06']})
pd.to_datetime(measData["Timestamp"], format= '%Y_%m_%d_%H:%M:%S')

The only way I found to reproduce your error is this (measData is a pd.DataFrame):

import pandas as pd

measData = pd.DataFrame({"Timestamp": ['2022_02_14_00:00:06', '2022_02_14_13:55:06', '2022_02_14_12:00:06']})
pd.to_datetime(measData, format= '%Y_%m_%d_%H:%M:%S')

So make sure that what you are putting into to_datetime is a pd.Series. If this does not help, please provide a small sample of your data.

  • Related