Home > Enterprise >  How to generate a Pandas weekly Date Range for multiple years
How to generate a Pandas weekly Date Range for multiple years

Time:10-23

I've been banging my head against trying to build a weekly date range for multiple years for about a day now so I figured I'd ask somewhere where people probably know a lot more than me. My goal ultimately is to merge a daily dataset with a weekly one and I need consistent dates across those two datasets in order to get the TimeSeries library that I want to use (DARTs) to accept my dates. My attempt at generating this looks like so:

all_weeks_generated = pd.DataFrame()
all_weeks_generated['week_dt'] = pd.date_range(start = '2001-01-01', end = '2021-09-13', freq='W')
all_weeks_generated['week'] = all_weeks_generated['week_dt'].dt.isocalendar().week
all_weeks_generated['year'] = all_weeks_generated['week_dt'].dt.year

but that spits out things that seem very wrong, for instance:

all_weeks_generated[(all_weeks_generated.year == 2017) & (all_weeks_generated.week == 52)]

returns:

    week_dt week    year
834 2017-01-01  52  2017
886 2017-12-31  52  2017

Jan 1 2017 is not the 52nd week of 2017, sadly. I get that Pandas is trying to tell me that 2017-01-01 is in the 52nd week of 2016 but I'm not sure what to do with that. This means that I can't look across my daily data, determine its week, and then get the beginning of the week in which it might occur. I've dug through as much of the Pandas documentation as I can stomach and it seems as though I'm fundamentally doing something wrong. Any tips on what I've done wrong and whether there's a way to modify my approach or another approach that I might look at which would let me make a list of weeks (I'm agnostic to Mon vs Sun, just want something that has one date per week) starting on Jan 1, 2001 and ending on September 13, 2021?

CodePudding user response:

You can set to 0 all weeks equal to 52 at the start of year:

>>> all_weeks_generated[(all_weeks_generated['week_dt'].dt.month.isin([1, 12]))
                        & (all_weeks_generated['week'] == 52)]

        week_dt  week  year
51   2001-12-30    52  2001
103  2002-12-29    52  2002
155  2003-12-28    52  2003
207  2004-12-26    52  2004
260  2006-01-01    52  2006  # 52 -> 0
312  2006-12-31    52  2006
364  2007-12-30    52  2007
416  2008-12-28    52  2008
468  2009-12-27    52  2009
521  2011-01-02    52  2011  # 52 -> 0
573  2012-01-01    52  2012  # 52 -> 0
625  2012-12-30    52  2012
677  2013-12-29    52  2013
729  2014-12-28    52  2014
781  2015-12-27    52  2015
834  2017-01-01    52  2017  # 52 -> 0
886  2017-12-31    52  2017
938  2018-12-30    52  2018
990  2019-12-29    52  2019
1042 2020-12-27    52  2020
# 52 -> 0
all_weeks_generated.loc[(all_weeks_generated['week_dt'].dt.month == 1)
                        & (all_weeks_generated['week'] == 52), 'week'] = 0
  • Related