Add leading zero to PySpark time components

Time:10-12

I have this code, which writes data partitioned by date and time:

from pyspark.sql import functions as F
from pyspark.sql.functions import col

# Derive the partition columns from the timestamp column
df = df.withColumn("year", F.year(col(date_column))) \
    .withColumn("month", F.month(col(date_column))) \
    .withColumn("day", F.dayofmonth(col(date_column))) \
    .withColumn("hour", F.hour(col(date_column)))

df.write.partitionBy("year", "month", "day", "hour").mode("append").format("csv").save(destination)

The output gets written to month=9. How can I make it month=09 instead? The same goes for hours, e.g. hour=04.

CodePudding user response:

You could try

.withColumn("month", F.date_format(col(date_column), "MM"))

and

.withColumn("hour", F.date_format(col(date_column), "HH"))