I'm attempting to build a column of dates, based on dates from a different column in a dataframe. It boils down to my company fiscal calendar resetting on the 22nd of every month. If a date in my source column is before the 22nd of the month, then my date needs to be the 22nd. If it's after the 22nd it needs to be on the 22nd of the following month.
I built some code, but I can't get it functioning correctly, I get an error The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I was thinking I'd break the old date apart into separate columns, process each column, then join them back up into a date and remove all the columns I'd use to process.
Here is the direction of the current code:
df_connection['Year_OLD'] = df_connection['Date Created'].dt.year
df_connection['Mon_OLD'] = df_connection['Date Created'].dt.month
df_connection['Day_OLD'] = df_connection['Date Created'].dt.day
def func_Year(y, m, d):
if (m == 12) & (d >= 22):
return y 1
else:
return y
def func_Mon(y, m, d):
if (m < 12) & (d >=22):
return m 1
elif (m == 12) & (d >= 22):
return 1
else:
return m
df_connection['Year'] = func_Year(df_connection['Year_OLD'], df_connection['Mon_OLD'], df_connection['Day_OLD'])
CodePudding user response:
In your code you are passing dataframe directly to the function which is not correct
You can pass values one by one by using lamda function
df_connection['Year_OLD'] = df_connection['Date Created'].dt.year
df_connection['Mon_OLD'] = df_connection['Date Created'].dt.month
df_connection['Day_OLD'] = df_connection['Date Created'].dt.day
def func_Year(y, m, d):
if (m == 12) & (d >= 22):
return y 1
else:
return y
def func_Mon(y, m, d):
if (m < 12) & (d >=22):
return m 1
elif (m == 12) & (d >= 22):
return 1
else:
return m
df_connection['Year'] = df_connection.apply(lambda x: func_Year(x['Year_OLD'], x['Mon_OLD'], x['Day_OLD']))
CodePudding user response:
IIUC, you can use mask
to check if a specific day is greater or equal 22. If so, use DateOffset
to add one month. Afterwards replace all days with 22.
# create example data
daterange = pd.DataFrame(pd.date_range('2022-01-01', '2022-12-31'), columns=['Date Created']
daterange['Date Created'].mask(
daterange['Date Created'].dt.day >= 22,
daterange['Date Created'] pd.DateOffset(months=1)
).apply(lambda x: x.replace(day=22))