My query outputs data - row by row - for each record I grab a value from last_modified_date
column that is the latest date in that column BUT is not later than the date
column's value. I save new value in column custom_last_modified_date
. Code works fine, but it only checks for the latest date available for one date - instead of every row in a table. Data looks like this:
id date last_modified_date
A 02/28/22 2017-02-28 22:44
A 03/05/22 2017-02-28 05:14
A 03/05/22 2017-02-28 07:49
A 03/22/22 2017-02-28 06:09
A 03/22/22 2022-03-01 06:49
B 03/25/22 2022-03-20 07:49
B 03/25/22 2022-04-01 09:24
Code:
SELECT
id,
date,
MAX(
IF(
date(
string(
TIMESTAMP(
DATETIME(
parse_datetime('%Y-%m-%d %H:%M', last_modified_date)
)
)
)
) <= date,
date(
string(
TIMESTAMP(
DATETIME(
parse_datetime('%Y-%m-%d %H:%M', last_modified_date)
)
)
)
),
null
)
) OVER (PARTITION BY date, id) as custom_last_modified_date
FROM `my_table`
Output is this:
id date custom_last_modified_date
A 02/28/22 02/28/17
A 03/05/22 02/28/17
A 03/05/22 02/28/17
A 03/22/22 03/01/22
A 03/22/22 03/01/22
B 03/25/22 03/20/22
B 03/25/22 03/20/22
Desired output is:
id date custom_last_modified_date
A 02/28/22 02/28/22
A 03/05/22 03/01/22
A 03/05/22 03/01/22
A 03/22/22 03/01/22
A 03/22/22 03/01/22
B 03/25/22 03/20/22
B 03/25/22 03/20/22
CodePudding user response:
I tried your query and your just have a mistake on partion by
, your query will work if you removed date
in your partition by
clause. Your query should look like this:
NOTE: I just created a CTE to convert dates to make it readable.
with sample_data as (
select 'A' as id, '03/05/22' as date_field, '2017-02-28 22:44' as last_modified_date,
union all select 'A' as id, '03/05/22' as date_field, '2017-02-28 05:14' as last_modified_date,
union all select 'A' as id, '03/05/22' as date_field, '2017-02-28 07:49' as last_modified_date,
union all select 'A' as id, '03/22/22' as date_field, '2017-02-28 06:09' as last_modified_date,
union all select 'A' as id, '03/22/22' as date_field, '2022-03-01 06:49' as last_modified_date,
union all select 'B' as id, '03/25/22' as date_field, '2022-03-20 07:49' as last_modified_date,
union all select 'B' as id, '03/25/22' as date_field, '2022-04-01 09:24' as last_modified_date,
),
conv_data as (
select
id,
format_datetime('%m/%d/%y',parse_date('%m/%d/%y',date_field)) as date_field,
format_datetime('%m/%d/%y',parse_datetime('%Y-%m-%d %H:%M', last_modified_date)) as last_modified_date,
from sample_data
)
select
id,
date_field,
last_modified_date,
max(if(last_modified_date <= date_field, last_modified_date, null)) over (partition by id) as custom_last_modified_date,
from conv_data
Output: