I have a query that goes like this
select
a.date_field,
b.date_field,
b.interested_data
from tbl_dates a
left join tbl_data b
on a.date_field = b.date_field
this gives me a resultset like:
a_date_field | b_date_field | b_interested_data
2022-01-01 | 2022-01-01 | data_1
2022-01-02 | 2022-01-02 | data_2
2022-01-03 | null | null
is it possible to return the last non-null value for b_date_field and b_interested_data in row 3? Ideally the result should be
a_date_field | b_date_field | b_interested_data
2022-01-01 | 2022-01-01 | data_1
2022-01-02 | 2022-01-02 | data_2
2022-01-03 | 2022-01-02 | data_2
I know that b_date_field would seem like an error but I am only interested in b_interested_data.
CodePudding user response:
perhaps you can do it using a APPLY
operator. Get the "last" row from table b based on date_field
select
a.date_field,
b.date_field,
b.interested_data
from tbl_dates a
cross apply
(
select TOP (1) b.date_field, b.interested_data
from tbl_data b
where a.date_field >= b.date_field
order by b.date_field desc
) b
CodePudding user response:
Using lag()
and coalese()
we can do it: assuming a_date_Field is the order we can use to determine the "Prior" value.
WITH CTE AS (SELECT '2022-01-01' a_date_field, '2022-01-01' b_date_field, 'data_1' b_interested_data UNION ALL
SELECT '2022-01-02', '2022-01-02', 'data_2' UNION ALL
SELECT '2022-01-03', null, null)
SELECT a_date_Field,
coalesce(B_Date_Field,lag(B_date_Field) over (order by a_date_Field)),
coalesce(B_Interested_Data,lag(B_Interested_Data) over (order by a_date_Field)) FROM CTE
Giving us:
-------------- ------------------ ------------------
| a_date_Field | (No column name) | (No column name) |
-------------- ------------------ ------------------
| 2022-01-01 | 2022-01-01 | data_1 |
| 2022-01-02 | 2022-01-02 | data_2 |
| 2022-01-03 | 2022-01-02 | data_2 |
-------------- ------------------ ------------------
CodePudding user response:
Basically I think there are two approaches. One is to use apply, one is to use last_value. I have prepared a small insert with two selects that return the same table, so pick the one you like most (EDIT: Pick the one that fits your solution and has the best performance in your case).
select
*
into #dates
from (values (1,cast('20220101' as date)),(2,cast('20220102' as date)),(3,cast('20220103' as date))) a(date_id,[date])
;
select
*
into #data
from (values ('data_1',cast('20220101' as date)),('data_2',cast('20220102' as date))) b(interested_data,[date])
;
--Solution 1
select distinct
last_value(a.date) over (partition by a.date order by b.date rows between current row and unbounded following)
, last_value(b.date) over (partition by a.date order by b.date rows between current row and unbounded following)
, last_value(b.interested_data) over (partition by a.date order by b.date rows between current row and unbounded following)
from #dates a
left join #data b on
b.date <= a.date
;
--Solution 2
select
a.date
, sub.date as b_date
, sub.interested_data
from #dates a
outer apply (
select top 1
b.date as date
, b.interested_data as interested_data
from #data b
where b.date <= a.date
order by b.date desc
) as sub