Requirement: find out week start date and end date, but only within the given month date.
Input df:
df = spark.createDataFrame([('1', '2022-07-01'), ('2', '2022-07-30'), ('3', '2022-07-31')], ["id", "date"])
df.createOrReplaceTempView("date_table")
# --- ----------
# | id| date|
# --- ----------
# | 1 |2022-07-01|
# | 2 |2022-07-30|
# | 3 |2022-07-31|
# --- ----------
Expected output:
--- ---------- ----------- -----------
| id| date|week_start | week_end |
--- ---------- ----------- -----------
| 1 |2022-07-01|2022-07-01 |2022-07-02 |
| 2 |2022-07-30|2022-07-24 |2022-07-30 |
| 3 |2022-07-31|2022-07-31 |2022-07-31 |
--- ---------- ----------- -----------
The below query does not work, as it goes out of the boundaries of the given month (row id 1 and 3).
spark.sql(""" select id,date,date_sub(date,dayofweek-1) as week_start,
date_add(date, 7-dayofweek) as week_end FROM
(select id,date,dayofweek(date) as dayofweek
FROM date_table) T""")
CodePudding user response:
In order to get the expected output, together with what you already have, you could use greatest
and least
, providing the start and end of month respectively.
SELECT id,
date,
greatest(date_sub(date, dayofweek-1), trunc(date, 'MM')) as week_start,
least(date_add(date, 7-dayofweek), last_day(date)) as week_end
FROM
(SELECT id, date, dayofweek(date) as dayofweek
FROM date_table) T
Full example:
df = spark.createDataFrame([('1', '2022-07-01'), ('2', '2022-07-30'), ('3', '2022-07-31')], ["id", "date"])
df.createOrReplaceTempView("date_table")
df = spark.sql(
"""
SELECT id,
date,
greatest(date_sub(date, dayofweek-1), trunc(date, 'MM')) as week_start,
least(date_add(date, 7-dayofweek), last_day(date)) as week_end
FROM
(SELECT id, date, dayofweek(date) as dayofweek
FROM date_table) T
"""
)
df.show()
# --- ---------- ---------- ----------
# | id| date|week_start| week_end|
# --- ---------- ---------- ----------
# | 1|2022-07-01|2022-07-01|2022-07-02|
# | 2|2022-07-30|2022-07-24|2022-07-30|
# | 3|2022-07-31|2022-07-31|2022-07-31|
# --- ---------- ---------- ----------