Revert multiindex Date and time to singleindex datetime

Time:03-24

Over the last two weeks I had to plot different time series with different datetime formats. Converting those into one format was no problem. Now I face a new challenge and am struggling to solve it. All the data (CSV) I got from my colleagues had one field containing both date and time, so I read it into a pandas DataFrame and reformatted the datetime. Today I received new data from a different system with two index columns, one for the date and one for the time. My problem is that those index columns form a MultiIndex (see below).

Old Data:

Datetime Data
01/01/2021 00:00 0,15
01/01/2021 00:15 5,18
Datetime;Data
2021-01-01 00:15:00;1,829
2021-01-01 00:30:00;1,675
2021-01-01 00:45:00;1,501

New Data:

Date Time Data
01/01/2021 00:00 0,15
00:15 5,18
Date; Time; Data
01/01/2021;00:15;71,04
;00:30;62,8
;00:45;73,2
;01:00;73,48
;01:15;66,8
;01:30;67,48
;01:45;71,12
;02:00;73,88

After reading this CSV into a pandas DataFrame with the following code, I am not able to add the time-specific data to the existing data because the indexes are not equal.

obtain = pd.read_csv('csv/data.csv',
                     sep=';',
                     encoding='utf-8',
                     index_col=['Date', 'Time'],
                     names=['Date', 'Time', 'Data'],
                     dtype={'Date': 'string', 'Time': 'string', 'Data': 'float'},
                     decimal=',')

How do I reset the index of the new data to a single Index in a pandas dataframe as a datetime column?

I tried to just convert the index to datetime as follows:

obtain.index = pd.to_datetime(obtain.index.map(' '.join))
obtain.index = pd.to_datetime(obtain.index)

CodePudding user response:

You can add the parse_dates parameter if the Date value is repeated on every row:

obtain = pd.read_csv('csv/data.csv', 
                     sep=';', 
                     encoding='utf-8', 
                     index_col=['Date', 'Time'], 
                     parse_dates=['Date', 'Time'], 
                     names=['Date', 'Time', 'Data'], 
                     dtype={'Data': 'float'},
                     decimal=',')
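
For the repeated-date case, a single datetime index can also be built by joining the two text columns after reading. This is only a minimal sketch: the inline sample rows are made up for illustration, and the day-first format string is an assumption about how the dates are written.

```python
import io
import pandas as pd

# Hypothetical sample in which every row repeats the Date value.
csv_text = """01/01/2021;00:15;71,04
01/01/2021;00:30;62,8
01/01/2021;00:45;73,2
"""

obtain = pd.read_csv(io.StringIO(csv_text),
                     sep=';',
                     names=['Date', 'Time', 'Data'],
                     dtype={'Data': 'float'},
                     decimal=',')

# Join both text columns and parse them in one step.
# Assumption: dates are written day-first (DD/MM/YYYY).
obtain.index = pd.to_datetime(obtain.pop('Date') + ' ' + obtain.pop('Time'),
                              format='%d/%m/%Y %H:%M')
obtain.index.name = 'Datetime'
```

Because pop() removes the columns while returning them, only the Data column remains, indexed by a single DatetimeIndex.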

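But if the Date is filled in only on the first row and left empty on the following rows, as in your sample, forward-fill it and join it with the Time: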

obtain = pd.read_csv('csv/data.csv', 
                     sep=';', 
                     encoding='utf-8', 
                     names=['Date', 'Time', 'Data'], 
                     dtype={'Data': 'float'},
                     decimal=',')


obtain.index = pd.to_datetime(obtain.pop('Date').ffill() + ' ' + obtain.pop('Time'))
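
To try the forward-fill approach without the original file, here is a self-contained sketch using a few rows copied from the question and read from an in-memory string. The explicit format string is an assumption that the dates are day-first (DD/MM/YYYY); the index name Datetime is chosen only to match the old data.

```python
import io
import pandas as pd

# Rows copied from the question: the Date appears only on the first row.
csv_text = """01/01/2021;00:15;71,04
;00:30;62,8
;00:45;73,2
;01:00;73,48
"""

obtain = pd.read_csv(io.StringIO(csv_text),
                     sep=';',
                     names=['Date', 'Time', 'Data'],
                     dtype={'Data': 'float'},
                     decimal=',')

# Empty Date fields are read as NaN, so ffill() carries the last date forward.
stamp = obtain.pop('Date').ffill() + ' ' + obtain.pop('Time')

# Assumption: day-first dates; adjust the format string if yours differ.
obtain.index = pd.to_datetime(stamp, format='%d/%m/%Y %H:%M')
obtain.index.name = 'Datetime'
```

With a single DatetimeIndex in place, the new frame should align with the older data, for example through pd.concat or DataFrame.join.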