Scala dataframe get last 6 months latest data

Time:11-19

I have a column in a dataframe like the one below:

 ------------------- 
|       timestampCol|
 ------------------- 
|2020-11-27 00:00:00|
|2020-11-27 00:00:00|
 ------------------- 

I need to filter the data based on this date so that I get only the last 6 months of data. Could anyone please suggest how I can do that?

CodePudding user response:

    dataset.filter(dataset.col("timestampCol").cast("date")
           .gt(add_months(current_date(),-6)));

This keeps only the rows whose timestampCol falls within the last 6 months; rows older than that are filtered out.

CodePudding user response:

Depending on the dataset schema, you may need to cast the value to a date. If the column is already a timestamp, just compare it directly with a java.sql.Timestamp instance.

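    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    // Replace the placeholder with a real value in the format yyyy-[m]m-[d]d hh:mm:ss
    val someMomentInTime =
      java.sql.Timestamp.valueOf("yyyy-[m]m-[d]d hh:mm:ss")

    val df: DataFrame =
      ???

    df.filter(col("timestampCol") > someMomentInTime) // DataFrame is Dataset[Row]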
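
The snippet above leaves the cutoff as a placeholder string. One possible way to compute a "six months ago" cutoff on the driver is sketched below using only java.time and java.sql. The object name SixMonthCutoff and both helper names are hypothetical, chosen for illustration, and the sketch assumes the cutoff should be measured in calendar months from the current local time.

```scala
import java.sql.Timestamp
import java.time.LocalDateTime

object SixMonthCutoff {
  // Hypothetical helper: the timestamp exactly six calendar months before `now`.
  def sixMonthsBefore(now: LocalDateTime): Timestamp =
    Timestamp.valueOf(now.minusMonths(6))

  // The same "strictly after the cutoff" comparison, applied to a single value.
  def isWithinLastSixMonths(value: Timestamp, now: LocalDateTime): Boolean =
    value.after(sixMonthsBefore(now))
}
```

With this in place, someMomentInTime could be set to SixMonthCutoff.sixMonthsBefore(LocalDateTime.now()) and passed to the filter unchanged. Note that the cutoff is then fixed once on the driver, whereas add_months(current_date(), -6) is evaluated by Spark as part of the query.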