Home > other >  Case when for a particular column in first table based on column values in second table
Case when for a particular column in first table based on column values in second table

Time:08-09

I am trying to crack the following problem with SQL on Snowflake, but I kept running into problems when trying to use case when [condition] is null then.... The problem I am encountering is as follows: I have two tables A and B, where A has column Market_ID, House_ID , Revenue and B has column Market_ID, House_ID, Date where Date has the format YYYY-MM-DD (e.g. 2022-08-01). Now, I am trying to join these two tables based on the following rule: if the Date column in Table B is above a fixed threshold (say, 2022-01-15), we select Market_ID from Table A and House_ID from either Table A or Table B depending on the last date that is lower than the fixed threshold. If the Date column in Table B is less than certain value, then we select Market_ID from Table A and set House_ID to be NaN. In any case, the end goal is to cover all the Market_ID and House_ID in Table A, while Table B is playing the role of conditional filter more or less.

Sample Data

Table A    Market_ID   House_ID    Revenue
        1  14023       125         80.24
        2  14050       128         91.25
        3  14101       196         150.25
        4  15005       197         160.5
        5  15516       189         120.3
        6  15212       600         69.5
        7  14115       209         88.2
      
Table B    Market_ID   House_ID    Date
        1  14023       125         2021-11-12
        2  14050       223         2022-05-22
        3  14101       151         2022-07-15
        4  14101       196         2021-03-21
        5  15005       189         2022-06-22
        6  15005       595         2022-11-18
        7  15516       209         2021-06-13
        8  15516       313         2021-05-15
        9  15212       595         2021-12-25
       10  15212       600         2022-03-05
       11  14115       300         2022-05-07
       12  14115       209         2022-10-06

Expected output. Given a fixed date, like 2022-01-15, then I would like to have the following output, as a result of the SQL Query:

               Market_ID   House_ID    
            1  14023       NaN -- since 2022-01-15 > 2021-11-12         
            2  14050       223 -- since 2022-01-15 < 2022-05-22        
            3  14101       151 -- since 2022-01-15 < 2022-07-15          
            4  15005       595 -- since 2022-01-15 < max(2022-11-18, 2022-06-22)        
            5  15516       NaN -- since 2022-01-15 > max(2021-06-13, 2021-05-15)         
            6  15212       600 -- since 2022-01-15 < 2022-03-05        
            7  14115       209 -- since 2022-01-15 < 2022-10-06         

Question. Could anyone please help me achieve the above goal with case when statement? I tried using

case a.House_ID
  when b.Date > [some condition here] then b.House_ID
  else NULL
end as House_ID
from Table A a
left join Table B b
on a.market_id = b.market_id` 

But it did not give me the expected output.

CodePudding user response:

try this statement

select case when a.create_date > 1905-05-31 then a.create_date else NULL end from test1 a left join test2 b on a.id=b.id

CodePudding user response:

I think this one can help you:

select
a.Market_ID, 
b.House_ID
from TableA a
left join TableB b
on a.market_id = b.market_id
and b.Date > '2022-01-15'
qualify row_number() over (partition by a.House_ID order by Date DESC) = 1
order by a.Market_ID; 

 ----------- ---------- 
| MARKET_ID | HOUSE_ID |
 ----------- ---------- 
|     14023 | NULL     |
|     14050 | 223      |
|     14101 | 151      |
|     14115 | 209      |
|     15005 | 595      |
|     15212 | 600      |
|     15516 | NULL     |
 ----------- ---------- 
  • Related