I would like to add 4 columns with dates to the following df:
entry_date sum entry_date payment_date
2000-01-05 8543304 2000-01-05 2000-02-18
2000-01-06 370224 2000-01-06 2000-03-24
2000-01-07 12040645 2000-01-07 2000-03-15
2000-01-10 14633220 2000-01-10 2000-03-22
2000-01-10 480648 2000-01-10 2000-03-21
... ... ... ... ... ... ... ... ... ...
2020-12-29 1779688 2020-12-29 2021-01-25
2020-12-30 596003 2020-12-30 2021-03-12
2020-12-30 1412306 2020-12-30 2021-02-04
2020-12-30 7588376 2020-12-30 2021-03-18
2020-12-31 11229071 2020-12-31 2021-02-01
5478 rows × 9 columns
I would like to create 4 columns with the dates: each one to the 15th of the nearest month if entry_date.day < 13.
I have written the following function:
def get_15th_day_of_month(dt):
if dt.day < 13:
return dt.replace(day=15)
new_month = (dt.month 1)
new_year = dt.year
while new_month > 12:
new_month = new_month - 12
new_year = new_year 1
new_day = 15
return datetime(dt.year, new_month, new_day, calendar.monthrange(dt.year, new_month))
But due to lack of experience I don't understand how to implement this effectively. I have written a loop that only returns zeros. It would also be great to implement an offset to the following business day in case the 15th is not a business day.
The final dataframe should (in my dreams) look as follows:
entry_date sum entry_date expiry_date date1 date2 date3 date4
2000-01-05 8543304 2000-01-05 2000-02-18 2000-02-15 2000-03-15 2000-04-15 2000-05-15
2000-01-06 370224 2000-01-06 2000-03-24 2000-01-15 2000-02-15 2000-03-15 2000-03-15
2000-01-07 12040645 2000-01-13 2000-03-15 2000-02-15 2000-03-15 2000-04-15 2000-05-15
2000-01-10 14633220 2000-01-10 2000-03-22
2000-01-10 480648 2000-01-10 2000-03-21
... ... ... ... ... ... ... ... ... ...... ... ... ... ... ... ... ... ... ... ... ... ... ...
2020-12-29 1779688 2020-12-29 2021-01-25
2020-12-30 596003 2020-12-30 2021-03-12
2020-12-30 1412306 2020-12-30 2021-02-04
2020-12-30 7588376 2020-12-30 2021-03-18
2020-12-31 11229071 2020-12-31 2021-02-01
5478 rows × 9 columns
CodePudding user response:
You can use month periods from datetimes, add one month if condition not match and then add 14 Days
(because first day per months already exist):
df['entry_date'] = pd.to_datetime(df['entry_date'])
m = df['entry_date'].dt.day < 13
per = df['entry_date'].dt.to_period('m')
for i in range(4):
df[f'date{i 1}'] = per.where(m, per 1).add(i).dt.to_timestamp() pd.Timedelta('14D')
print (df)
sum entry_date payment_date date1 date2 date3 \
2000-01-05 8543304 2000-01-05 2000-02-18 2000-01-15 2000-02-15 2000-03-15
2000-01-06 370224 2000-01-06 2000-03-24 2000-01-15 2000-02-15 2000-03-15
2000-01-07 12040645 2000-01-13 2000-03-15 2000-02-15 2000-03-15 2000-04-15
2000-01-10 14633220 2000-01-10 2000-03-22 2000-01-15 2000-02-15 2000-03-15
2000-01-10 480648 2000-01-10 2000-03-21 2000-01-15 2000-02-15 2000-03-15
2020-12-29 1779688 2020-12-29 2021-01-25 2021-01-15 2021-02-15 2021-03-15
2020-12-30 596003 2020-12-30 2021-03-12 2021-01-15 2021-02-15 2021-03-15
2020-12-30 1412306 2020-12-30 2021-02-04 2021-01-15 2021-02-15 2021-03-15
2020-12-30 7588376 2020-12-30 2021-03-18 2021-01-15 2021-02-15 2021-03-15
2020-12-31 11229071 2020-12-31 2021-02-01 2021-01-15 2021-02-15 2021-03-15
date4
2000-01-05 2000-04-15
2000-01-06 2000-04-15
2000-01-07 2000-05-15
2000-01-10 2000-04-15
2000-01-10 2000-04-15
2020-12-29 2021-04-15
2020-12-30 2021-04-15
2020-12-30 2021-04-15
2020-12-30 2021-04-15
2020-12-31 2021-04-15