How to add a column indicating a repeat id in Snowflake (SQL)?-CodePudding

So I have a table like this, where each ID is unique per row:

table1

 ID    data
001  Walter
002  Skylar
003    Hank
004   Marie

I have another table, where the ID's can appear multiple times:

table2

ID  value
001     apple
001    banana
003     grape
004  graphite
003     jones
001      pear

All I want to do is given both these tables, I want to add a column to Table 1, to indicate if an ID appears more than once in table 2

Final Result:

 ID    data  table2_multiple
001  Walter                1
002  Skylar                0
003    Hank                1
004   Marie                0

Here we show that both ID = 1 and ID = 3 have table2_multiple = 1, since they both appear more than once in table2!

CodePudding user response：

despite that it is such an odd thing to do but here is how you can do it:

update table1
set table2_multiple = case when t.cnt > 1 then 1 else 0 end 
from (select ID , count(*) cnt from table2 group by ID) t 
where t.id = table1.id

or if you meant just to select :

select t1.* , case when t2.cnt > 1 then 1 else 0 end as table2_multiple
from table1 t1 
join (select ID , count(*) cnt from table2 group by ID) t2
on t1.id = t2.id

CodePudding user response：

in all examples we use a case exxpression to determine if the count is >1 set to 1 else 0.

Basic aggregate functions:

SELECT t1.ID, t1.Data, case when count(*) > 1 then 1 else 0 end as table2_Multiple
FROM Table1 t1 --t1 is an alias of table1
LEFT JOIN table2 t2 --t2 is an alias of table2
 ON t1.ID = t2.ID
GROUP BY T1.ID, T1.Data

Using an analytic function: (Count() over (partition xxx) this basically says count all records by unique T1ID and data then the expression says if that count is > 1 return 1 else 0. The distinct then eliminates all duplicates.

SELECT Distinct t1.ID
     , t1.Data
     , case when count() over (partition by T1.ID, T1.Data) > 1 then 1 else 0 end as Table_2_multiple
LEFT JOIN Table2 T2
  on T1.ID = T2.ID

Using an inline view (T2) to get counts by table2 in this case the sub query will return only 1 row per ID so no need to handle multiples.

SELECT T1.*, case when coalesce(t2.ValueNo,0) > 1 then 1 else 0 end as table2_Multiple 
FROM Table1
LEFT JOIN (SELECT ID, count(*) as valueNo 
           FROM Table2 
           GROUP BY ID) T2
 on T1.ID = T2.ID