I have a column in a DataFrame like the one below:
+-------------------+
|       timestampCol|
+-------------------+
|2020-11-27 00:00:00|
|2020-11-27 00:00:00|
+-------------------+
I need to filter the data on this column and keep only the last 6 months of data. Could anyone suggest how I can do that?
CodePudding user response:
// requires: import static org.apache.spark.sql.functions.*;
dataset.filter(dataset.col("timestampCol").cast("date")
        .gt(add_months(current_date(), -6)));
This keeps only the rows whose timestampCol falls within the last 6 months and filters out everything older.
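To make the comparison concrete, here is a minimal plain-Java sketch of the same per-row check, without Spark. The class name, the method name and the fixed "today" value are assumptions made for illustration; it mirrors the logic of the filter, not Spark's own implementation.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class SixMonthFilterSketch {
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Mirrors cast("date").gt(add_months(current_date(), -6)) for a single value.
    // `today` is passed in (hypothetical parameter) so the result is reproducible.
    static boolean isWithinLastSixMonths(String timestamp, LocalDate today) {
        // Drop the time part, like the cast to date.
        LocalDate date = LocalDateTime.parse(timestamp, FMT).toLocalDate();
        LocalDate cutoff = today.minusMonths(6);
        // Strictly after the cutoff, like gt.
        return date.isAfter(cutoff);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2021, 3, 1); // assumed "current date"
        System.out.println(isWithinLastSixMonths("2020-11-27 00:00:00", today)); // true
        System.out.println(isWithinLastSixMonths("2020-09-01 00:00:00", today)); // false
        System.out.println(isWithinLastSixMonths("2020-08-31 00:00:00", today)); // false
    }
}
```

Note that a value exactly on the cutoff date is dropped because the comparison is strict; if that day should be kept, geq would be the call to use in the Spark filter instead of gt.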
CodePudding user response:
Depending on the dataset schema, you may need to cast the value to a date first. If the column already has a date/time type, just compare it directly with a java.sql.Timestamp instance.
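import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
// The string is a format placeholder; pass a real value such as your cutoff moment
val someMomentInTime =
  java.sql.Timestamp.valueOf("yyyy-[m]m-[d]d hh:mm:ss")
val df: DataFrame =
  ??? // your DataFrame goes here
df.filter(col("timestampCol") > someMomentInTime) // DataFrame is Dataset[Row]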
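The string passed to Timestamp.valueOf in the snippet above only shows the expected format, so a real cutoff still has to be supplied. One possible way to build a "six months ago" value with only the JDK is sketched below; the class and method names are hypothetical, and the same calls can be made from Scala.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

public class CutoffTimestampSketch {
    // Returns the moment six months before `now` as a java.sql.Timestamp.
    // `now` is a parameter (hypothetical) so callers can fix it in tests.
    static Timestamp sixMonthsBefore(LocalDateTime now) {
        return Timestamp.valueOf(now.minusMonths(6));
    }

    public static void main(String[] args) {
        // Cutoff relative to the current local date and time.
        Timestamp cutoff = sixMonthsBefore(LocalDateTime.now());
        System.out.println(cutoff);
    }
}
```

The resulting Timestamp can take the place of someMomentInTime in the filter. It is computed in the JVM's local time zone, which may differ from the Spark session time zone, so treat the boundary as approximate unless both are aligned.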