I have a parameter variable which I want to turn into a date variable indicating the last working day of the month. From what I've read, I can turn d = date.today() into the last working day, but not a bare value like 202109. My script looks like this:
from pandas.tseries.offsets import BMonthEnd
from datetime import date
date = 202109
d = date_format(date, 'yyyyMM')
offset = BMonthEnd()
lastworkingday = offset.rollforward(d)
I'm pretty sure it goes wrong when turning date into d but I do not know how to fix it. Additionally, can you tell me how to keep only the date and drop the time in the result? Thank you.
CodePudding user response:
IIUC, you want a column of DateType in your Spark dataframe that is equal to the last working day of the month specified in the input variable date. Here is a solution:
from datetime import datetime
from pandas.tseries.offsets import BMonthEnd
import pyspark.sql.functions as F
# input variable
date = 202109
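# parse YYYYMM into a datetime; the day defaults to the 1st of the month
d = datetime.strptime(str(date), '%Y%m')
# roll forward to the month's last business day (Mon-Fri; holidays are not considered)
offset = BMonthEnd()
last_working_day = offset.rollforward(d)
# format as a date-only string, which drops the time component
my_date = last_working_day.strftime('%Y-%m-%d')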
print(my_date)
# 2021-09-30
# add the column to your existing Spark dataframe df, as DateType
df = df.withColumn('my_date', F.to_date(F.lit(my_date)))
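If pandas is not available where this runs, the same date can be worked out with the standard library alone. The following is only a minimal sketch: the function name last_working_day is made up for illustration, and it assumes that "working day" means Monday to Friday with no holiday calendar, which matches what BMonthEnd does by default. It returns a datetime.date, so there is no time part to drop.

```python
import calendar
from datetime import date, timedelta


def last_working_day(yyyymm: int) -> date:
    """Return the last Monday-to-Friday day of the month given as YYYYMM.

    Hypothetical helper; public holidays are NOT taken into account.
    """
    # split 202109 into (2021, 9)
    year, month = divmod(yyyymm, 100)
    # monthrange returns (weekday of the 1st, number of days in the month)
    _, days_in_month = calendar.monthrange(year, month)
    day = date(year, month, days_in_month)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6
    while day.weekday() >= 5:
        day -= timedelta(days=1)
    return day


print(last_working_day(202109))  # 2021-09-30
print(last_working_day(202110))  # 2021-10-29 (the 31st is a Sunday)
```

Calling str() on the result gives the same 'YYYY-MM-DD' string that the answer passes to F.to_date.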