I have a very long dataframe that has day-of-year, decimal hours, and year (int format) in separated columns. I want to convert it to datetime and set it as the index of the dataframe. I found a way to do it but it is extremely inefficient. Is there a better way to do the same thing?
Here is how I do it (with the 5 first values of my dataframe):
import numpy as np
import pandas as pd
import datetime
varr = np.random.rand(5)
dec_h = [1.96667, 1.97083, 1.97500,1.97917, 1.98333]
doy = [154, 154, 154, 154, 154]
year = [2021, 2021, 2021, 2021, 2021]
df = pd.DataFrame()
df['values'] = varr
df['decimal_hour'] = dec_h
df['day_of_year'] = doy
df['year'] = year
# Convert decimal hours into hours, minutes, seconds
hours = []
minutes = []
seconds = []
for i in range(0,len(df)):
hours.append(str(int(df.decimal_hour.values[i])))
minutes.append(str(int((df.decimal_hour.values[i]*60) % 60)))
seconds.append(str(int((df.decimal_hour.values[i]*3600) % 60)))
# Transform doy to date
date1 = []
for i in range(0,len(df)):
date1.append(datetime.datetime(df.year.values[i], 1, 1) datetime.timedelta(int(df.day_of_year.values[i]) - 1))
# Convert to string
year = []
month = []
day = []
for i in range(0,len(df)):
year.append(str(date1[i].year))
month.append(str(date1[i].month))
day.append(str(date1[i].day))
# Concatenate everything
date_time = []
for i in range(0,len(df)):
date_time.append(datetime.datetime.strptime(day[i] '/' month[i] '/' year[i] ' ' hours[i] ':' minutes[i] ':' seconds[i] , "%d/%m/%Y %H:%M:%S"))
df.index = date_time
CodePudding user response:
You can convert the day_of_year
/year
/decimal_hour
to datetime for example like this:
df["date"] = df.apply(
lambda x: pd.to_datetime(x["year"], format="%Y")
pd.DateOffset(days=x["day_of_year"] - 1, hours=x["decimal_hour"]),
axis=1,
)
Prints:
values decimal_hour day_of_year year date
2021-06-03 01:58:00 0.226575 1.96667 154 2021 2021-06-03 01:58:00.012
2021-06-03 01:58:14 0.356103 1.97083 154 2021 2021-06-03 01:58:14.988
2021-06-03 01:58:30 0.566151 1.97500 154 2021 2021-06-03 01:58:30.000
2021-06-03 01:58:45 0.851411 1.97917 154 2021 2021-06-03 01:58:45.012
2021-06-03 01:58:59 0.670402 1.98333 154 2021 2021-06-03 01:58:59.988