I have this function, which writes data partitioned by date and time:
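from pyspark.sql import functions as F
from pyspark.sql.functions import col

# F.year/F.month/F.dayofmonth/F.hour return integers, so the values are not zero-padded
df = df.withColumn("year", F.year(col(date_column))) \
    .withColumn("month", F.month(col(date_column))) \
    .withColumn("day", F.dayofmonth(col(date_column))) \
    .withColumn("hour", F.hour(col(date_column)))
df.write.partitionBy("year", "month", "day", "hour").mode("append").format("csv").save(destination)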
The output gets written to directories such as month=9.
How can I make it zero-padded, like month=09?
The same goes for hours, e.g. hour=04.
CodePudding user response:
You could try date_format, which returns a string and zero-pads it for two-letter patterns:
.withColumn("month", F.date_format(col(date_column), "MM"))
and
.withColumn("hour", F.date_format(col(date_column), "HH"))