The goal is to convert a pandas timestamp object to week of the year according to the following user defined calendars for 2021 and 2022 respectively.
I am using the week property of pandas datetime, which worked for dates in 2021, but they are breaking down for the next year. This is the initial function I wrote.
def week(date: pandas.Timestamp) -> int:
"""Convert the date to a week according to client calendar."""
orig_week: int = date.week % 53
return orig_week 1 if date.dayofweek < 5 else orig_week 2
I added the modulo 53 since without it sometimes the rest of the logic gives me numbers like 54 or something. But I am not sure about the internal logic of the pandas week property, so cannot really get a grip on how to convert that week to the above user defined calendar, although it sounds like a simple shift. The problems are around the edge cases (end or beginning of year). So any help will be appreciated.
CodePudding user response:
One solution is to create a custom calendar for the input year. The dates in the year will be converted to a Period representing weeks that start on Saturday and end on Friday.
import pandas as pd
def week(in_date: pd.Timestamp) -> int:
"""Convert the date to a week according to client calendar."""
# Setup a custom calendar table for the year of input date
in_date_year = str(in_date.year)
_df = pd.DataFrame({'Date':pd.date_range(in_date_year '-1-1', in_date_year '-12-31')})
_df['Period'] = _df['Date'].dt.to_period('W-FRI') # define week period that starts on SAT and ends on FRI
_df['Week_Num'] = _df['Period'].dt.week
# Adjust week number for year start
_df['Week_Num'] = np.where(_df['Week_Num'].iloc[0] >= 52, _df['Week_Num'] % 53 1, _df['Week_Num'])
# Adjust week number for year end
_df.iloc[-7:, _df.columns.get_loc('Week_Num')] = np.where(_df['Week_Num'].iloc[-7:] < 52, 53, _df['Week_Num'].iloc[-7:])
# Get week number and return
return _df.loc[_df['Date'] == in_date, 'Week_Num'].iat[0]
The week numbers returned for the Period will be set according to weeks that start on SAT and end on FRI. However, for the year start and year end, the week number may still show week number corresponding to the previous/next year. Hence, we checked and adjusted for this year start/end discrepancy accordingly.
Result:
week(pd.Timestamp('2022-01-01'))
#output
1
week(pd.Timestamp('2022-12-31'))
#output
53
The underlying table is built as follows:
For year 2022:
# print first 10 rows of the year
print(_df.head(10))
Date Period Week_Num
0 2022-01-01 2022-01-01/2022-01-07 1
1 2022-01-02 2022-01-01/2022-01-07 1
2 2022-01-03 2022-01-01/2022-01-07 1
3 2022-01-04 2022-01-01/2022-01-07 1
4 2022-01-05 2022-01-01/2022-01-07 1
5 2022-01-06 2022-01-01/2022-01-07 1
6 2022-01-07 2022-01-01/2022-01-07 1
7 2022-01-08 2022-01-08/2022-01-14 2
8 2022-01-09 2022-01-08/2022-01-14 2
9 2022-01-10 2022-01-08/2022-01-14 2
# print last 10 rows of the year
print(_df.tail(10))
Date Period Week_Num
355 2022-12-22 2022-12-17/2022-12-23 51
356 2022-12-23 2022-12-17/2022-12-23 51
357 2022-12-24 2022-12-24/2022-12-30 52
358 2022-12-25 2022-12-24/2022-12-30 52
359 2022-12-26 2022-12-24/2022-12-30 52
360 2022-12-27 2022-12-24/2022-12-30 52
361 2022-12-28 2022-12-24/2022-12-30 52
362 2022-12-29 2022-12-24/2022-12-30 52
363 2022-12-30 2022-12-24/2022-12-30 52
364 2022-12-31 2022-12-31/2023-01-06 53
For year 2025:
# print first 10 rows of the year
print(_df.head(10))
Date Period Week_Num
0 2025-01-01 2024-12-28/2025-01-03 1
1 2025-01-02 2024-12-28/2025-01-03 1
2 2025-01-03 2024-12-28/2025-01-03 1
3 2025-01-04 2025-01-04/2025-01-10 2
4 2025-01-05 2025-01-04/2025-01-10 2
5 2025-01-06 2025-01-04/2025-01-10 2
6 2025-01-07 2025-01-04/2025-01-10 2
7 2025-01-08 2025-01-04/2025-01-10 2
8 2025-01-09 2025-01-04/2025-01-10 2
9 2025-01-10 2025-01-04/2025-01-10 2
# print last 10 rows of the year
print(_df.tail(10))
Date Period Week_Num
355 2025-12-22 2025-12-20/2025-12-26 52
356 2025-12-23 2025-12-20/2025-12-26 52
357 2025-12-24 2025-12-20/2025-12-26 52
358 2025-12-25 2025-12-20/2025-12-26 52
359 2025-12-26 2025-12-20/2025-12-26 52
360 2025-12-27 2025-12-27/2026-01-02 53
361 2025-12-28 2025-12-27/2026-01-02 53
362 2025-12-29 2025-12-27/2026-01-02 53
363 2025-12-30 2025-12-27/2026-01-02 53
364 2025-12-31 2025-12-27/2026-01-02 53