Home > Blockchain >  Change the day of the date to a particular day
Change the day of the date to a particular day

Time:06-15

I basically have a requirement that needs a column that as the PeriodEndDate in. The period always ends on the 23rd of the month.

I need to take a date from a column in this case it is the last day of the month each day, and set the "day" of that date to be "23".

I have tried doing the following:

.withColumn("periodEndDate", change_day(jsonDF2.periodDate, sf.lit(23)))

cannot import name 'change_day' from 'pyspark.sql.functions'

CodePudding user response:

You can use make_date

from pyspark.sql import functions as F
df = spark.createDataFrame([('2022-05-31',)], ['periodDate'])

df = df.withColumn('periodEndDate', F.expr("make_date(year(periodDate), month(periodDate), 23)"))

df.show()
#  ---------- ------------- 
# |periodDate|periodEndDate|
#  ---------- ------------- 
# |2022-05-31|   2022-05-23|
#  ---------- ------------- 

CodePudding user response:

As far as I know, there is no function change_day however, you can make one using UDF. Pass a date and replace day.

Example:

from datetime import datetime
from pyspark.sql import SparkSession
from pyspark.sql.types import TimestampType
from pyspark.sql import functions as F


def change_day(date, day):
    return date.replace(day=day)


change_day = F.udf(change_day, TimestampType())

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([{"date": datetime(2022, 1, 31)}])
df = df.withColumn("23day", change_day(F.col("date"), F.lit(23)))

df.show(20, False)

Result:

 ------------------- -------------------                                        
|date               |23day              |
 ------------------- ------------------- 
|2022-01-31 00:00:00|2022-01-23 00:00:00|
 ------------------- ------------------- 
  • Related