I'm having all kinds of trouble combining these two date columns into a single datetime column. The data looks like this:
dfn2.head(3)
Out[134]:
Plant_Name YYMMDD HHMM BestGuess(kWh)
0 BII NEE STIPA 20180101 0100 20715.0
1 BII NEE STIPA 20180101 0200 15742.0
2 BII NEE STIPA 20180101 0300 16934.0
dfn2.dtypes
Out[138]:
Plant_Name object
YYMMDD object
HHMM object
BestGuess(kWh) float64
dtype: object
I've tried several options and I'm not getting the expected result from:
dfn2['Datetime'] = (pd.to_datetime(dfn2['YYMMDD'],format='%Y%m%d').add(pd.to_timedelta(dfn2['HHMM'], 'h')))
dfn2.head(3)
Out[101]:
Plant_Name YYMMDD HHMM BestGuess(kWh) Datetime
0 BII NEE STIPA 20180101 100 20715.0 2018-01-05 04:00:00
1 BII NEE STIPA 20180101 200 15742.0 2018-01-09 08:00:00
2 BII NEE STIPA 20180101 300 16934.0 2018-01-13 12:00:00
I'm expecting the 'Datetime' column of the first 3 rows to look like:
2018-01-01 01:00:00
2018-01-01 02:00:00
2018-01-01 03:00:00
not like what the result shows above. I've also tried this lambda solution and the result looks the same:
dfn2['DateTime'] = dfn2['YYMMDD'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d')) (pd.to_timedelta(dfn2.HHMM, unit='H'))
dfn2.head(3)
Out[103]:
Plant_Name YYMMDD HHMM BestGuess(kWh) Datetime DateTime
0 BII NEE STIPA 20180101 100 20715.0 2018-01-05 04:00:00 2018-01-05 04:00:00
1 BII NEE STIPA 20180101 200 15742.0 2018-01-09 08:00:00 2018-01-09 08:00:00
2 BII NEE STIPA 20180101 300 16934.0 2018-01-13 12:00:00 2018-01-13 12:00:00
Am I missing something? thank you,
CodePudding user response:
You can do something like this
pd.to_datetime(df['YYMMDD'].astype(str) ' ' df['HHMM'].astype(str))
P.S. you can do this directly while reading the CSV file using the parameterparse_dates=[['YYMMDD', 'HHMM']]