I have a column in a dataframe firstday
, in timestamp format, looks like this 2022-03-01 00:00:00
I'm trying to pass run_date
a string parameter at run time as filter. But the sql below didn't work. How do you achieve this? Many thanks!
Trying to convert run_date
, which looks like 2022-03-28
to the first day of the month and use it to compare with firstday
spark.sql("select * from df where firstday = date_trunc('mon','{}')".format(run_date))
CodePudding user response:
Almost there, little formatting issues
orginal df
--- ----------
| id| firstday|
--- ----------
| 1|2022-03-01|
| 2|2022-03-17|
--- ----------
Apply filter
run_date="'2022-03-21'"
filtereddf=spark.sql(("select * from df where firstday ==date_trunc('mon',{})").format(run_date))
filtereddf.show()
or
run_date="2022-03-21"
filtereddf=spark.sql(("select * from df where firstday ==date_trunc('mon','{}')").format(run_date))
filtereddf.show()
--- ----------
| id| firstday|
--- ----------
| 1|2022-03-01|
--- ----------