I have a scenario where I have 2 tables and I want to take join on the basis of USERID and ENDOFMONTH columns between Table A and Table B. There are some cases where the matching ENDOFMONTH value is not present in Table B. For Example, 1st Row in Table A, 2021-03-31 with USERID 100 is not present in Table B. In that case, we have to look at the previous ENDOFMONTH in Table B in order to do the mapping. So 1st Row, 2021-03-31 with USERID 100 from Table A will get mapped to 2021-01-31 (USERID 100) of Table B.
In short: If Table A ENDOFMONTH is not present in Table B, look at the previous months present in Table B until a row is found and map with that ENDOFMONTH
Table A:
USERID | WORKING_ENDOFMONTH |
---|---|
100 | 2021-03-31 |
100 | 2021-08-31 |
100 | 2021-10-31 |
100 | 2022-01-31 |
101 | 2021-10-31 |
101 | 2022-01-31 |
101 | 2022-02-28 |
101 | 2022-03-31 |
101 | 2022-06-30 |
Table B:
USERID | ENDOFMONTH | HOURS |
---|---|---|
100 | 2021-01-31 | 110 |
100 | 2021-08-31 | 130 |
100 | 2022-01-31 | 120 |
100 | 2022-02-28 | 130 |
100 | 2022-03-31 | 110 |
100 | 2022-04-30 | 120 |
101 | 2021-07-31 | 90 |
101 | 2022-02-28 | 100 |
101 | 2022-04-30 | 80 |
101 | 2022-06-30 | 120 |
Result Table:
USERID | WORKING_ENDOFMONTH | HOURS |
---|---|---|
100 | 2021-03-31 | 110 |
100 | 2021-08-31 | 130 |
100 | 2021-10-31 | 130 |
100 | 2022-01-31 | 120 |
101 | 2021-10-31 | 90 |
101 | 2022-01-31 | 90 |
101 | 2022-02-28 | 100 |
101 | 2022-03-31 | 100 |
101 | 2022-06-30 | 120 |
Please guide. Not sure how to dynamically look at previous months until a value is found. Thanks.
CodePudding user response:
This solution might not scale depending on the nature of your data -- but with the data available, it works well. With QUALIFY
you can look at what's the first month that comes up when joining with all the available months before the previous one.
with a as (
select split(value, '|') s, s[0]::int id, s[1]::date month
from table(split_to_table(
$$100 | 2021-03-31
100 | 2021-08-31
100 | 2021-10-31
100 | 2022-01-31
101 | 2021-10-31
101 | 2022-01-31
101 | 2022-02-28
101 | 2022-03-31
101 | 2022-06-30$$, '\n'))
), b as (
select split(value, '|') s, s[0]::int id, s[1]::date month, s[2]::int hours
from table(split_to_table(
$$100 | 2021-01-31 | 110
100 | 2021-08-31 | 130
100 | 2022-01-31 | 120
100 | 2022-02-28 | 130
100 | 2022-03-31 | 110
100 | 2022-04-30 | 120
101 | 2021-07-31 | 90
101 | 2022-02-28 | 100
101 | 2022-04-30 | 80
101 | 2022-06-30 | 120$$, '\n'))
)
select a.id, a.month working_endofmonth, b.month month_data_from, b.hours
from a
join b
on a.id=b.id
where b.month <= a.month
qualify row_number() over(partition by a.id, a.month order by b.month desc) = 1
order by a.id, a.month
;
CodePudding user response:
Assuming the WORKING_ENDOFMONTH is a Date field and not a string, One way to solve this is using a nested Select query using top 1 order by desc. The order by is the key to getting the last value hence important as you ordering by date.
First you can left join the TableA and TableB together using the USERID and WORKING_ENDOFMONTH Columns this gets the records you are looking for with null values where there no matching data in the TableB. If you have hours, you can use that directly, if it is null value you can using the ISNULL and do a nested select.
The query will look something like this:
Select TableA.USERID, TableA.WORKING_ENDOFMONTH,
IsnULL(TableB.HOURS, (Select top 1 HOURS
from TableB as MaxLastValue
where MaxLastValue.USERID = TableA.USERID
and MaxLastValue.WORKING_ENDOFMONTH <= TABLEA.WORKING_ENDOFMONTH
order by WORKING_ENDOFMONTH desc)) AS HOURS
from TableA Left join TableB
on (TableA.USERID = TableB.USERID
and TableA.WORKING_ENDOFMONTH = TableB.WORKING_ENDOFMONTH)
Order BY TableA.USERID