How to convert year, month, day, hour/minute columns into a single datetime column?

Time:01-14

I have the following data format with different columns for year, month, day, and hour_minute (the first two digits are hour and the last two digits are minutes). How do I create a new column in datetime format by combining all of these existing columns?

YEAR MONTH DAY HOUR_MINUTE
2015 1 15 0010
2015 1 2 0020
2015 1 15 0045
2015 1 15 2110
2015 10 21 2359

I have tried the following but had no luck. Thank you for your advice.

df["new_column"]= pd.to_datetime(df[["YEAR", "MONTH", "DAY","HOUR_MINUTE"]])

CodePudding user response:

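You need to split the HOUR_MINUTE column into separate HOUR and MINUTE columns, because pd.to_datetime can only assemble a DataFrame whose columns are named after datetime units (year, month, day, hour, minute, ...):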

df["HOUR"] = df["HOUR_MINUTE"].str[0:2]
df["MINUTE"] = df.pop("HOUR_MINUTE").str[2:4]
df["new_column"] = pd.to_datetime(df[["YEAR", "MONTH", "DAY", "HOUR", "MINUTE"]])
print(df)

Output:

   YEAR  MONTH  DAY HOUR MINUTE          new_column
0  2015      1   15   00     10 2015-01-15 00:10:00
1  2015      1    2   00     20 2015-01-02 00:20:00
2  2015      1   15   00     45 2015-01-15 00:45:00
3  2015      1   15   21     10 2015-01-15 21:10:00
4  2015     10   21   23     59 2015-10-21 23:59:00
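
The `.str` slicing above relies on HOUR_MINUTE being stored as four-character strings. If the column was instead read as integers (an assumption about how the data was loaded, so that `0010` arrived as `10`), a minimal sketch along these lines derives the hour and minute arithmetically. The sample frame and the helper names `hm` and `parts` are introduced here only so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question; HOUR_MINUTE is assumed to have been
# parsed as integers, so the leading zeros are gone.
df = pd.DataFrame({
    "YEAR": [2015, 2015, 2015, 2015, 2015],
    "MONTH": [1, 1, 1, 1, 10],
    "DAY": [15, 2, 15, 15, 21],
    "HOUR_MINUTE": [10, 20, 45, 2110, 2359],
})

hm = df["HOUR_MINUTE"].astype(int)

# pd.to_datetime assembles a datetime from columns named after units.
parts = pd.DataFrame({
    "year": df["YEAR"],
    "month": df["MONTH"],
    "day": df["DAY"],
    "hour": hm // 100,   # first two digits of HHMM
    "minute": hm % 100,  # last two digits of HHMM
})
df["new_column"] = pd.to_datetime(parts)
print(df)
```

This also works when HOUR_MINUTE holds strings such as "0010", since `astype(int)` drops the leading zeros before the division.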

CodePudding user response:

You can apply this to the entire df if it contains only the year, month, day and hour_minute columns. Zero-pad each field so that the concatenated string matches the format unambiguously:

df.apply(lambda row: pd.to_datetime("%04d%02d%02d%04d" % tuple(int(v) for v in row), format="%Y%m%d%H%M"), axis=1)
Out[198]:
0   2015-01-15 00:10:00
1   2015-01-02 00:20:00
2   2015-01-15 00:45:00
3   2015-01-15 21:10:00
4   2015-10-21 23:59:00
dtype: datetime64[ns]

If there are other columns as well, select the required columns first and then apply:

df[['YEAR', 'MONTH', 'DAY', 'HOUR_MINUTE']].apply(lambda row: pd.to_datetime("%04d%02d%02d%04d" % tuple(int(v) for v in row), format="%Y%m%d%H%M"), axis=1)
Out[201]:
0   2015-01-15 00:10:00
1   2015-01-02 00:20:00
2   2015-01-15 00:45:00
3   2015-01-15 21:10:00
4   2015-10-21 23:59:00
dtype: datetime64[ns]

If you want the result assigned to df as new_column:

df['new_column'] = df[['YEAR', 'MONTH', 'DAY', 'HOUR_MINUTE']].apply(lambda row: pd.to_datetime("%04d%02d%02d%04d" % tuple(int(v) for v in row), format="%Y%m%d%H%M"), axis=1)
df
Out[205]:
   YEAR  MONTH  DAY HOUR_MINUTE          new_column
0  2015      1   15        0010 2015-01-15 00:10:00
1  2015      1    2        0020 2015-01-02 00:20:00
2  2015      1   15          45 2015-01-15 00:45:00
3  2015      1   15        2110 2015-01-15 21:10:00
4  2015     10   21        2359 2015-10-21 23:59:00
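
Concatenating the fields only parses unambiguously when the month and day are padded to two digits and the time to four; otherwise a string such as "2015115..." can be read as November 5 rather than January 15. A vectorized sketch of the same idea, avoiding a row-wise apply, might look like the following. The sample frame and the intermediate name `stamp` are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question, with HOUR_MINUTE kept as strings.
df = pd.DataFrame({
    "YEAR": [2015, 2015, 2015, 2015, 2015],
    "MONTH": [1, 1, 1, 1, 10],
    "DAY": [15, 2, 15, 15, 21],
    "HOUR_MINUTE": ["0010", "0020", "0045", "2110", "2359"],
})

# Build a fixed-width YYYYMMDDHHMM string for every row at once.
stamp = (
    df["YEAR"].astype(str).str.zfill(4)
    + df["MONTH"].astype(str).str.zfill(2)
    + df["DAY"].astype(str).str.zfill(2)
    + df["HOUR_MINUTE"].astype(str).str.zfill(4)
)

# Parse the whole column in a single call.
df["new_column"] = pd.to_datetime(stamp, format="%Y%m%d%H%M")
print(df)
```

Parsing the column in one call is generally faster on large frames than calling pd.to_datetime once per row.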