Home > OS >  Pandas Convert YYMMDD & HHMM column to single DateTime Column
Pandas Convert YYMMDD & HHMM column to single DateTime Column

Time:09-24

I'm having all kinds of trouble combining these two date columns into a single datetime column. The data looks like this:

dfn2.head(3)
Out[134]: 
      Plant_Name    YYMMDD  HHMM  BestGuess(kWh)
0  BII NEE STIPA  20180101  0100         20715.0
1  BII NEE STIPA  20180101  0200         15742.0
2  BII NEE STIPA  20180101  0300         16934.0

dfn2.dtypes
Out[138]: 
Plant_Name         object
YYMMDD             object
HHMM               object
BestGuess(kWh)    float64
dtype: object

I've tried several options and I'm not getting the expected result from:

dfn2['Datetime'] = (pd.to_datetime(dfn2['YYMMDD'],format='%Y%m%d').add(pd.to_timedelta(dfn2['HHMM'], 'h')))

dfn2.head(3)
Out[101]: 
      Plant_Name    YYMMDD  HHMM  BestGuess(kWh)            Datetime
0  BII NEE STIPA  20180101   100         20715.0 2018-01-05 04:00:00
1  BII NEE STIPA  20180101   200         15742.0 2018-01-09 08:00:00
2  BII NEE STIPA  20180101   300         16934.0 2018-01-13 12:00:00

I'm expecting the 'Datetime' column of the first 3 rows to look like:

2018-01-01 01:00:00
2018-01-01 02:00:00
2018-01-01 03:00:00 

not like what the result shows above. I've also tried this lambda solution and the result looks the same:

dfn2['DateTime'] = dfn2['YYMMDD'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))   (pd.to_timedelta(dfn2.HHMM, unit='H'))

dfn2.head(3)
Out[103]: 
      Plant_Name    YYMMDD  HHMM  BestGuess(kWh)            Datetime            DateTime
0  BII NEE STIPA  20180101   100         20715.0 2018-01-05 04:00:00 2018-01-05 04:00:00
1  BII NEE STIPA  20180101   200         15742.0 2018-01-09 08:00:00 2018-01-09 08:00:00
2  BII NEE STIPA  20180101   300         16934.0 2018-01-13 12:00:00 2018-01-13 12:00:00

Am I missing something? thank you,

CodePudding user response:

You can do something like this

pd.to_datetime(df['YYMMDD'].astype(str)   ' '   df['HHMM'].astype(str))

P.S. you can do this directly while reading the CSV file using the parameterparse_dates=[['YYMMDD', 'HHMM']]

  • Related