I have a time series with 2 columns, the first being hours after 1 Jan 1970. In this column, a year is only 360 days, with 12 months of 30 days each. I need to convert this column into a usable date so that I can analyse the other column by month, year, etc. (e.g. 1997-Jan-1-1, being year-month-day-hour).
I think I need to use modulo arithmetic to split each row of the hours column into hour_of_day, day_of_month, month and year, so that the column becomes a year, month, day and hour instead. But I don't know how to do this. I appreciate it might be confusing. Any help would be much appreciated.
Input: 233280.5 (in hours)
Output: 1997-01-01-01 (year-month-day-hour)
CodePudding user response:
You can calculate the number of whole years, add it to the reference date, and then scale the fractional year by the actual length of that year, e.g.
import pandas as pd
import numpy as np
from pandas.tseries.offsets import DateOffset
refdate = pd.Timestamp('1970-01-01')
df = pd.DataFrame({'360d_year_hours': [233280.5]})
# we calculate the number of years and fractional years as helper Series
y_frac, y = np.modf(df['360d_year_hours'] / (24*360))
# now we can calculate the new date's year:
df['datetime'] = pd.Series([refdate + DateOffset(years=int(i)) for i in y])
# we need the days in the given year to be able to use y_frac
daysinyear = np.where(df['datetime'].dt.is_leap_year, 366, 365)
# ...so we can update the datetime and round to the hour:
df['datetime'] = (df['datetime'] + pd.to_timedelta(y_frac*daysinyear, unit='d')).dt.round('h')
# df['datetime']
# 0 1997-01-01 01:00:00
# Name: datetime, dtype: datetime64[ns]
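If you would rather keep the 360-day calendar fields exactly as they are (the approach above stretches each 360-day year onto a real 365/366-day year, so months and days can shift), the modulo idea from the question can be sketched as follows. This is only a sketch under some assumptions: the function name `split_360day_hours` and the constants are made up for illustration, hour 0 is taken to be 1970-01-01 00:00, and fractional hours are truncated rather than rounded.

```python
import numpy as np
import pandas as pd

# Assumed calendar layout, taken from the question: 12 months of 30 days each.
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30
MONTHS_PER_YEAR = 12
REF_YEAR = 1970  # hours are counted from 1970-01-01 00:00 (assumption)


def split_360day_hours(hours):
    """Split hours since the reference date into year, month, day and hour.

    Hypothetical helper: fractional hours are truncated (floored).
    """
    total_hours = np.floor(np.asarray(hours, dtype=float)).astype(np.int64)
    # Peel off one unit at a time: hours -> days -> months -> years.
    total_days, hour = np.divmod(total_hours, HOURS_PER_DAY)
    total_months, day0 = np.divmod(total_days, DAYS_PER_MONTH)
    years, month0 = np.divmod(total_months, MONTHS_PER_YEAR)
    return pd.DataFrame({
        "year": REF_YEAR + years,
        "month": month0 + 1,  # months and days are 1-based
        "day": day0 + 1,
        "hour": hour,         # hour of day, 0-23
    })


df = pd.DataFrame({"360d_year_hours": [233280.5]})
parts = split_360day_hours(df["360d_year_hours"])
print(parts)
```

With truncation, 233280.5 comes out as year 1997, month 1, day 1, hour 0. Whether the half hour should count as hour 01, as in the question's expected output, depends on a convention the question doesn't spell out, so adjust the flooring step if needed. The resulting columns can be used directly with `groupby`, but note that dates such as day 30 of month 2 exist in this calendar and cannot be turned into a regular pandas datetime.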