Over the last two weeks I had to plot several time series with different datetime formats, and converting them into one format was no problem. Now I face a new challenge and am struggling to solve it. All the data (CSV) I got from my colleagues so far had one field containing both date and time, so I read it into a pandas DataFrame and reformatted the datetime. Today I got new data from a different system with two index columns, one for the date and a second one for the time. My problem is that these index columns are laid out like a MultiIndex (see below).
Old Data:
Datetime | Data |
---|---|
01/01/2021 00:00 | 0,15 |
01/01/2021 00:15 | 5,18 |
Datetime;Data
2021-01-01 00:15:00;1,829
2021-01-01 00:30:00;1,675
2021-01-01 00:45:00;1,501
New Data:
Date | Time | Data |
---|---|---|
01/01/2021 | 00:00 | 0,15 |
 | 00:15 | 5,18 |
Date; Time; Data
01/01/2021;00:15;71,04
;00:30;62,8
;00:45;73,2
;01:00;73,48
;01:15;66,8
;01:30;67,48
;01:45;71,12
;02:00;73,88
After reading this CSV into a pandas DataFrame with the following code, I am not able to add the time-specific data to the existing data because the indexes are not equal.
obtain = pd.read_csv('csv/data.csv', sep=';', encoding='utf-8', index_col=['Date', 'Time'], names=['Date', 'Time', 'Data'], dtype={'Date': 'string', 'Time': 'string', 'Data': 'float'}, decimal=',')
How do I reset the index of the new data to a single datetime index in a pandas DataFrame?
I tried to simply convert the index to datetime as follows:
obtain.index = pd.to_datetime(obtain.index.map(' '.join))
obtain.index = pd.to_datetime(obtain.index)
CodePudding user response:
You can add the parse_dates parameter if the Date values are repeated:
obtain = pd.read_csv('csv/data.csv',
                     sep=';',
                     encoding='utf-8',
                     index_col=['Date', 'Time'],
                     parse_dates=['Date', 'Time'],
                     names=['Date', 'Time', 'Data'],
                     dtype={'Data': 'float'},
                     decimal=',')
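If a single DatetimeIndex is wanted rather than a two-level index, one option is to read both columns as text and join them before parsing. The following is only a minimal sketch: the inline sample, the name `csv_text`, the index name `Datetime`, and the day/month/year format are assumptions for illustration, not taken from the real file.

```python
import io

import pandas as pd

# Hypothetical sample in the "new" layout, with the date repeated on every row.
csv_text = """Date;Time;Data
01/01/2021;00:15;71,04
01/01/2021;00:30;62,8
01/01/2021;00:45;73,2
"""

# Read Date and Time as plain strings so they can be concatenated.
df = pd.read_csv(io.StringIO(csv_text), sep=';', decimal=',',
                 dtype={'Date': str, 'Time': str})

# Join date and time as text, then parse with an explicit format
# (assumed here to be day/month/year hour:minute).
stamp = df.pop('Date') + ' ' + df.pop('Time')
df.index = pd.to_datetime(stamp, format='%d/%m/%Y %H:%M')
df.index.name = 'Datetime'

print(df)
```

Passing an explicit `format` avoids any guessing about whether `01/01/2021` is day-first or month-first.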
But if the Date values are missing in the following rows:
obtain = pd.read_csv('csv/data.csv',
                     sep=';',
                     encoding='utf-8',
                     names=['Date', 'Time', 'Data'],
                     dtype={'Data': 'float'},
                     decimal=',')
# forward-fill the blank dates, then join date and time into one string
obtain.index = pd.to_datetime(obtain.pop('Date').ffill() + ' ' + obtain.pop('Time'))
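To see the forward-fill approach end to end, here is a small self-contained sketch built on a few rows from the question. The inline `csv_text`, the variable names, the index name `Datetime`, and the day/month/year format are assumptions for illustration; the real file may need different options.

```python
import io

import pandas as pd

# Hypothetical sample: the date appears only on the first row of each day.
csv_text = """Date;Time;Data
01/01/2021;00:15;71,04
;00:30;62,8
;00:45;73,2
"""

new = pd.read_csv(io.StringIO(csv_text), sep=';', decimal=',',
                  dtype={'Date': str, 'Time': str})

# Blank Date cells are read as NaN; ffill() copies the last seen date down.
stamp = new.pop('Date').ffill() + ' ' + new.pop('Time')

# Parse the combined text into a single DatetimeIndex
# (format assumed to be day/month/year hour:minute).
new.index = pd.to_datetime(stamp, format='%d/%m/%Y %H:%M')
new.index.name = 'Datetime'

print(new)
```

After this step the frame has one DatetimeIndex, like the old data, so the two can be aligned on it.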