Home > Enterprise >  How can i get all daily dates between two given dates in DataFrame
How can i get all daily dates between two given dates in DataFrame

Time:11-09

I have this DataFrame of Spark:

 ------------------- ------------------- 
|       first       |        last       |
 ------------------- ------------------- 
|2022-11-03 00:00:00|2022-11-06 00:00:00|
 ------------------- ------------------- 

I need to get the dates from the first, not including, to the last, including, jumping from day to day.

My desired output would be:

 ------------------- ------------------- ----------------------------------------------------------------- 
|       first       |        last       |                         array_dates                      
 ------------------- ------------------- ----------------------------------------------------------------- 
|2022-11-03 00:00:00|2022-11-06 00:00:00| [2022-11-04 00:00:00, 2022-11-05 00:00:00, 2022-11-06 00:00:00] |
 ------------------- ------------------- ----------------------------------------------------------------- 

CodePudding user response:

Since Spark, 2.4, you can use sequence built-in spark function as follows:

import org.apache.spark.sql.functions.{col, date_add, expr, sequence}

val result = df.withColumn(
  "array_dates", 
  sequence(
    date_add(col("first"), 1), 
    col("last"), 
    expr("INTERVAL 1 DAY")
  )
)

with following input df:

 ------------------- ------------------- 
|first              |last               |
 ------------------- ------------------- 
|2022-11-03 00:00:00|2022-11-06 00:00:00|
 ------------------- ------------------- 

You get the following output result:

 ------------------- ------------------- --------------------------------------------------------------- 
|first              |last               |array_dates                                                    |
 ------------------- ------------------- --------------------------------------------------------------- 
|2022-11-03 00:00:00|2022-11-06 00:00:00|[2022-11-04 00:00:00, 2022-11-05 00:00:00, 2022-11-06 00:00:00]|
 ------------------- ------------------- --------------------------------------------------------------- 
  • Related