Converting a row in a column to LocalDate in Spark

Time:09-24

I'm having a problem filtering rows where data_riferimento_condizioni < todayData.

With the following code, I get the wrong results:

val todayData = LocalDate.now.format(
      DateTimeFormatter.ofPattern("dd/MM/yyyy")) //22/09/2021


val filtredDF = sampleData.where(sampleData("data_riferimento_condizioni") < todayData)

One of the results:

+--------+--------+---------------------------+-----------+
|istituto|servizio|data_riferimento_condizioni|      stato|
+--------+--------+---------------------------+-----------+
|   62952|     923|                 02/12/2022|in progress|
+--------+--------+---------------------------+-----------+

As you can see, I get rows whose date is after todayData. I want to convert data_riferimento_condizioni to LocalDate so that I can use public boolean isBefore(ChronoLocalDate other).

Answer:

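First, you need to convert "data_riferimento_condizioni" from StringType to DateType or TimestampType with the to_date() or to_timestamp() functions, and only then filter your data. As long as both sides are strings, the comparison is lexicographic (character by character), so "02/12/2022" sorts before "22/09/2021" simply because "0" comes before "2".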
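To see why the string filter misbehaves, here is a minimal sketch in plain Scala, without Spark, using the row value and the formatted date from the question. The object and method names are made up for illustration.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object StringVsDateComparison {
  // Same pattern the question uses to format today's date
  private val fmt = DateTimeFormatter.ofPattern("dd/MM/yyyy")

  // Lexicographic comparison, which is what a filter on a string column does
  def beforeAsStrings(a: String, b: String): Boolean = a < b

  // Chronological comparison after parsing both values to LocalDate
  def beforeAsDates(a: String, b: String): Boolean =
    LocalDate.parse(a, fmt).isBefore(LocalDate.parse(b, fmt))

  def main(args: Array[String]): Unit = {
    val rowValue = "02/12/2022" // value from the sample row
    val today    = "22/09/2021" // formatted "today" from the question

    println(beforeAsStrings(rowValue, today)) // true: '0' sorts before '2'
    println(beforeAsDates(rowValue, today))   // false: 2022 is after 2021
  }
}
```

The string comparison lets the row through while the date comparison rejects it, which matches the unexpected row in the question's output.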

With Spark 3 and newer, you can filter your rows by comparing the column directly with instances of java.time.LocalDate or java.time.Instant:


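import java.time.LocalDate
import org.apache.spark.sql.functions.{col, to_date}

// Parse the string column into a DateType column, then compare it with a LocalDate
val filtredDF = sampleData
    .withColumn("converted", to_date(col("data_riferimento_condizioni"), "dd/MM/yyyy"))
    .where(col("converted") < LocalDate.now)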


If you're using Spark 2, however, you have to convert your LocalDate or Instant to java.sql.Date or java.sql.Timestamp first:


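import java.sql.Date
import java.time.LocalDate
import org.apache.spark.sql.functions.{col, to_date}

// Spark 2 needs a java.sql.Date on the right-hand side of the comparison
val filtredDF = sampleData
    .withColumn("converted", to_date(col("data_riferimento_condizioni"), "dd/MM/yyyy"))
    .where(col("converted") < Date.valueOf(LocalDate.now))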
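The conversions mentioned for Spark 2 are plain JDK calls, so they can be tried without Spark. The sketch below wraps them in two helpers. The object and method names are hypothetical, while Date.valueOf(LocalDate) and Timestamp.from(Instant) are the standard java.sql methods available since Java 8.

```scala
import java.sql.{Date, Timestamp}
import java.time.{Instant, LocalDate}

object JavaTimeToSql {
  // LocalDate -> java.sql.Date (the type Spark 2 uses for DateType values)
  def toSqlDate(d: LocalDate): Date = Date.valueOf(d)

  // Instant -> java.sql.Timestamp (the type Spark 2 uses for TimestampType values)
  def toSqlTimestamp(i: Instant): Timestamp = Timestamp.from(i)

  def main(args: Array[String]): Unit = {
    val sqlDate = toSqlDate(LocalDate.of(2021, 9, 22))
    println(sqlDate)             // prints 2021-09-22
    println(sqlDate.toLocalDate) // converts back to a LocalDate

    val sqlTs = toSqlTimestamp(Instant.ofEpochSecond(0L))
    println(sqlTs.toInstant)     // converts back to the same Instant
  }
}
```

Either result can then be used on the right-hand side of the column comparison in the filter above.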

You can read more about working with dates in Spark, and about the differences between Spark 2 and Spark 3, there.
