How can I generate a dataframe with the days of the year based on an input year?

Time:04-06

I would like to generate a dataframe with one row per day of the year, based on a specified year. How can I do this? I am using the pandas date_range function.

Here is what I have tried:

#Import modules
import pandas as pd
import numpy as np
import datetime as dt

#Specify the year
year = 1976

#Create dataframe
df = pd.Series(pd.date_range(year, periods=365, freq='D'))

print(df)

The result:

0     1970-01-01 00:00:00.000001976
1     1970-01-02 00:00:00.000001976
2     1970-01-03 00:00:00.000001976
3     1970-01-04 00:00:00.000001976
4     1970-01-05 00:00:00.000001976
                   ...             
360   1970-12-27 00:00:00.000001976
361   1970-12-28 00:00:00.000001976
362   1970-12-29 00:00:00.000001976
363   1970-12-30 00:00:00.000001976
364   1970-12-31 00:00:00.000001976
Length: 365, dtype: datetime64[ns]

The year is wrong here; I need it to be 1976. Additionally, all I need is a "Day of the Year" column whose number of rows matches the number of days in the year (so it accounts for leap years). How can I fix this?

The output should be a dataframe that looks like this (it should extend all the way to the last day of the year):

d = {
    'year': [1976, 1976, 1976, 1976, 1976, 1976],
    'day of the year': [1, 2, 3, 4, 5, 6]
}
df1 = pd.DataFrame(data=d)
df1

CodePudding user response:

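import pandas as pd

year = 1976
# One timestamp per calendar day, from January 1 to December 31
dates = pd.Series(pd.date_range(str(year) + "-01-01", str(year) + "-12-31", freq="D"))
# Gap between consecutive dates in days (first gap is missing, so fill with 1), then a running total
days = dates.diff().dt.days.fillna(1).cumsum()
df = pd.DataFrame({"year": dates.dt.year, "days": days})
df = df.set_index(dates)
print(df)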
#             year   days
# 1976-01-01  1976    1.0
# 1976-01-02  1976    2.0
# 1976-01-03  1976    3.0
# 1976-01-04  1976    4.0
# 1976-01-05  1976    5.0
# ...          ...    ...
# 1976-12-27  1976  362.0
# 1976-12-28  1976  363.0
# 1976-12-29  1976  364.0
# 1976-12-30  1976  365.0
# 1976-12-31  1976  366.0

# [366 rows x 2 columns]

Or

import calendar
import pandas as pd

year = 1976

# 366 rows for a leap year, 365 otherwise
n_days = 366 if calendar.isleap(year) else 365
df = pd.DataFrame({"year": year,
                   "days": range(1, n_days + 1)})
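Or

As for why the original attempt printed 1970: the integer 1976 passed to pd.date_range was read as 1976 nanoseconds after the Unix epoch, which is the trailing .000001976 in the output. Passing date strings avoids that. Below is a minimal sketch that reads the day number directly from the dates, so leap years are handled by the calendar itself. The helper name days_of_year is hypothetical, and the sketch assumes a four-digit year inside the range pandas timestamps support.

```python
import pandas as pd


def days_of_year(year: int) -> pd.DataFrame:
    """Return one row per calendar day of `year` (hypothetical helper)."""
    # Date strings are parsed as calendar dates, not as nanoseconds since 1970
    dates = pd.date_range(start=f"{year}-01-01", end=f"{year}-12-31", freq="D")
    return pd.DataFrame({
        "year": dates.year,
        "day of the year": dates.dayofyear,  # 1-based day number within the year
    })


df = days_of_year(1976)
print(df)
```

The day numbers come out as integers, and the frame has 366 rows for 1976 because it is a leap year.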