I am working to create a target column that includes the following week's number and year. I have the beginning of this complete, but I am struggling to account for week 52 and the need to reset the week to 1 and add 1 to the year.
Here is the coding for the YEAR_WEEK-targ column that is running successfully.
df_calls_subweek_target['YEAR_WEEK_targ'] = df_calls_subweek_target['YEAR'].astype(str) "_" (df_calls_subweek_target['WEEK'] 1).astype(str)
YEAR_WEEK | YEAR_WEEK_targ |
---|---|
2008_38 | 2008_39 |
2008_39 | 2008_40 |
I am trying to figure out how to code for the first week of the year. Using np.where makes sense to me, but as you can see, I am not an expert.
df_calls_subweek_target['YEAR_WEEK_targ'] = np.where(df_calls_subweek_target['YEAR_WEEK_targ'] == 52, (df_calls_subweek_target['YEAR'] 1).astype(str) "_" (df_calls_subweek_target['WEEK'] ==1).astype(str))
So, need to figure out how to say when week = 52, increase year 1 and week = 1.
Looking for any recommendations on how to approach this case!
CodePudding user response:
def function1(ss:pd.Series):
ss=ss.astype(int)
if ss.col1==52:
ss.col1=1
ss.col0=ss.col0 1
return "{}_{}".format(ss.col0,ss.col1)
df1['YEAR_WEEK_targ'] = df1.YEAR_WEEK.str.split("_",expand=True).add_prefix("col").apply(function1,axis=1)
df1
out
YEAR_WEEK YEAR_WEEK_targ
0 2008_38 2008_38
1 2008_39 2008_39
2 2008_52 2009_1
CodePudding user response:
A tricky solution would be to mask
the values where week=52
and replace them by 0
.
s = df["WEEK"].mask(df["WEEK"].eq(52)).fillna(0, downcast="infer").add(1)
df["YEAR_WEEK_targ"] = df['YEAR'].astype(str) "_" s.astype(str).str.zfill(2)
An alternative, you can also use numpy.where
:
df["YEAR_WEEK_targ"] = np.where(df["WEEK"].eq(52),
df['YEAR'].astype(str) "_01",
df['YEAR'].astype(str) "_" df["WEEK"].astype(str))
NB : For the sake of clarity, I used (df
) instead of (df_calls_subweek_target
)
Output :
print(df)
YEAR WEEK YEAR_WEEK_targ
0 2008 38 2008_39
1 2008 39 2008_40
2 2008 52 2008_01