Home > Net >  SQL: check if multiple values in the column SQL
SQL: check if multiple values in the column SQL

Time:06-10

Is it possible to check if multiple values are in the column and based on that filter out one of them using a WHERE clause?

Obviously this code won't work, but here is the logical example of what I'd like to achieve:

SELECT * 
FROM table
WHERE IF column includes ('value1', 'value2') THEN NOT IN 'value1'

example with conditions True:

column
value1
value1
value2
value2
value1
value3
value4
value4

result:

column
value2
value2
value3
value4
value4

Side note: process has to be automated as in one upload, dataset might contain value1 which should remain in place and in the next one both of them will be populated and only value2 will be valid.

CodePudding user response:

If both val1 and val2 exist then exclude val1 otherwise no filter...

declare @t table (col varchar(10))

insert into @t
values
('val1'),('val1'),('val2'),('val3')

select *
from @t
where col <> case when 2 = (select count(*) from (select col from @t where col  in('val1','val2') group by col)a)
             then 'val1'
             else '' end

Results:

col
val2
val3

This is an example when both are not present

declare @t2 table (col varchar(10))

insert into @t2
values
('val1'),('val1'),('val3')

select *
from @t2
where col <> case when 2 = (select count(*) from (select col from @t2 where col  in('val1','val2') group by col)a)
             then 'val1'
             else '' end

Results:

col
val1
val1
val3

Note: the else value needs to be a value that cannot exist in the column col

Note2: This is answered using t-sql

CodePudding user response:

Using QUALIFY. The idea is to compare value of the column against an array generated ad-hoc with case expression to handle subsitution logic:

SELECT *
FROM tab
QUALIFY NOT ARRAY_CONTAINS(col::VARIANT, 
ARRAY_AGG(DISTINCT CASE WHEN col IN ('value1', 'value2') THEN 'value1' END) OVER());

For sample data:

CREATE OR REPLACE TABLE tab AS
SELECT $1 AS col
FROM (VALUES 
      ('value1'), ('value1'), ('value2'), 
      ('value2'), ('value1'), ('value3'),  
      ('value4'), ('value4')
)s;

Output:

enter image description here


A more explicit approach is using windowed COUNT_IF:

SELECT *
FROM tab
QUALIFY col NOT IN (CASE WHEN COUNT_IF(col IN ('value1', 'value2')) OVER() > 1 
                         THEN 'value1' 
                         ELSE '' 
                    END);
  • Related