Is it possible to check if multiple values are in the column and based on that filter out one of them using a WHERE
clause?
Obviously this code won't work, but here is the logical example of what I'd like to achieve:
SELECT *
FROM table
WHERE IF column includes ('value1', 'value2') THEN NOT IN 'value1'
example with conditions True:
column |
---|
value1 |
value1 |
value2 |
value2 |
value1 |
value3 |
value4 |
value4 |
result:
column |
---|
value2 |
value2 |
value3 |
value4 |
value4 |
Side note: process has to be automated as in one upload, dataset might contain value1 which should remain in place and in the next one both of them will be populated and only value2 will be valid.
CodePudding user response:
If both val1 and val2 exist then exclude val1 otherwise no filter...
declare @t table (col varchar(10))
insert into @t
values
('val1'),('val1'),('val2'),('val3')
select *
from @t
where col <> case when 2 = (select count(*) from (select col from @t where col in('val1','val2') group by col)a)
then 'val1'
else '' end
Results:
col
val2
val3
This is an example when both are not present
declare @t2 table (col varchar(10))
insert into @t2
values
('val1'),('val1'),('val3')
select *
from @t2
where col <> case when 2 = (select count(*) from (select col from @t2 where col in('val1','val2') group by col)a)
then 'val1'
else '' end
Results:
col
val1
val1
val3
Note: the else value needs to be a value that cannot exist in the column col
Note2: This is answered using t-sql
CodePudding user response:
Using QUALIFY. The idea is to compare value of the column against an array generated ad-hoc with case expression to handle subsitution logic:
SELECT *
FROM tab
QUALIFY NOT ARRAY_CONTAINS(col::VARIANT,
ARRAY_AGG(DISTINCT CASE WHEN col IN ('value1', 'value2') THEN 'value1' END) OVER());
For sample data:
CREATE OR REPLACE TABLE tab AS
SELECT $1 AS col
FROM (VALUES
('value1'), ('value1'), ('value2'),
('value2'), ('value1'), ('value3'),
('value4'), ('value4')
)s;
Output:
A more explicit approach is using windowed COUNT_IF
:
SELECT *
FROM tab
QUALIFY col NOT IN (CASE WHEN COUNT_IF(col IN ('value1', 'value2')) OVER() > 1
THEN 'value1'
ELSE ''
END);