I have data of the form,
Year | Day | Hour | Minute | Field magnitude average, nT | Speed, km/s | AE-index, nT | SYM/H, nT | ASY/H, nT | PCN-index | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2017 | 250 | 0 | 0 | 8.27 | 390.6 | 117 | 42 | 20 | 2.00 |
1 | 2017 | 250 | 0 | 1 | 3.14 | 405.3 | 121 | 42 | 19 | 1.64 |
2 | 2017 | 250 | 0 | 2 | 3.86 | 434.2 | 124 | 42 | 21 | 1.72 |
3 | 2017 | 250 | 0 | 3 | 4.17 | 473.2 | 141 | 42 | 19 | 1.92 |
4 | 2017 | 250 | 0 | 4 | 3.43 | 497.0 | 107 | 41 | 19 | 1.70 |
How can I create a datetime object in Pandas using the Year, Day, Hour, Minute columns? What's the best approach?
I tried using
pd.to_datetime(dict(year=df.Year,day=df.Day,month=None))
which failed!
Also tried,
df["DateTime"] = pd.to_datetime(df[["Year","Day","Hour","Minute"]]).dt.strftime('%d/%y %H:%M')
CodePudding user response:
You can merge as a single string and use to_datetime
with a custom format ('%Y-%j-%H-%M'
for Year-DayOfYear-Hours-Minutes):
pd.to_datetime(df[['Year', 'Day', 'Hour', 'Minute']]
.astype(str).agg('-'.join, axis=1), format='%Y-%j-%H-%M')
Output:
0 2017-09-07 00:00:00
1 2017-09-07 00:01:00
2 2017-09-07 00:02:00
3 2017-09-07 00:03:00
4 2017-09-07 00:04:00
dtype: datetime64[ns]
CodePudding user response:
I think your Day column is JULIAN date, i.e. when 1st Jan is 1, then julian 250 is 9th July (in non leap year). If that is so, you can create additional column YJ to concatenate Year and Day and then apply datetime creation.
import pandas as pd
data = [{'Year':2017, 'Day':250, 'Hour':0, 'Minute':0, 'Data': 8.27},
{'Year':2017, 'Day':250, 'Hour':1, 'Minute':0, 'Data': 3.14}]
df= pd.DataFrame(data)
df['YJ'] = (df['Year'].apply(str) df['Day'].apply(str)) #convert to string
df['DateTime'] = (df['YJ'].apply(lambda x: dt.datetime.strptime(x,'%Y%j'))) #new column
#output: df.head()
Year | Day | Hour | Minute | Data | YJ | DateTime | |
---|---|---|---|---|---|---|---|
0 | 2017 | 250 | 0 | 0 | 8.27 | 2017250 | 2017-09-07 |
1 | 2017 | 250 | 1 | 0 | 3.14 | 2017250 | 2017-09-07 |