Excluding Records Based on Another Column's Value-CodePudding

I'm working in Redshift and have two columns from an Adobe Data feed:

post_evar22 and post_page_url.

Each post_evar22 has multiple post_page_url values as they are all the pages that the ID visited. (It's basically a visitor ID and all the pages they visited)

I want to write a query where I can list distinct post_evar22 values that have never been associated with a post_page_url that contains '%thank%' or '%confirm%'.

In the dataset below, ID1 would be completely omitted from the query results bceause it was associated with a thank-you page and a confirmation page.

CodePudding user response：

This is a case for NOT EXISTS:

select distinct post_evar22
from table t1
where not exists (
    select 1
    from table t2
    where t2.post_evar22 = t1.post_evar22
    and (t2.post_page_url like '%thank%' or t2.post_page_url like '%confirm%')
)

Or MINUS if your dbms supports it:

select post_evar22 from table
minus
select post_evar22 from table where (post_page_url like '%thank%' or post_page_url like '%confirm%')

CodePudding user response：

Seems fairly straight forward. Am I missing something?

SELECT DISTINCT post_evar22
FROM table 
WHERE post_page_url NOT LIKE '%thank%' 
    AND post_page_url NOT LIKE'%confirm%