Home > Software design >  Excluding Records Based on Another Column's Value
Excluding Records Based on Another Column's Value

Time:05-27

I'm working in Redshift and have two columns from an Adobe Data feed:

post_evar22 and post_page_url.

Each post_evar22 has multiple post_page_url values as they are all the pages that the ID visited. (It's basically a visitor ID and all the pages they visited)

I want to write a query where I can list distinct post_evar22 values that have never been associated with a post_page_url that contains '%thank%' or '%confirm%'.

In the dataset below, ID1 would be completely omitted from the query results bceause it was associated with a thank-you page and a confirmation page.

enter image description here

CodePudding user response:

This is a case for NOT EXISTS:

select distinct post_evar22
from table t1
where not exists (
    select 1
    from table t2
    where t2.post_evar22 = t1.post_evar22
    and (t2.post_page_url like '%thank%' or t2.post_page_url like '%confirm%')
)

Or MINUS if your dbms supports it:

select post_evar22 from table
minus
select post_evar22 from table where (post_page_url like '%thank%' or post_page_url like '%confirm%')

CodePudding user response:

Seems fairly straight forward. Am I missing something?

SELECT DISTINCT post_evar22
FROM table 
WHERE post_page_url NOT LIKE '%thank%' 
    AND post_page_url NOT LIKE'%confirm%
  •  Tags:  
  • sql
  • Related