Home > Enterprise >  how to filter based on events that HAVE happened and HAVE NOT happened
how to filter based on events that HAVE happened and HAVE NOT happened

Time:11-22

I have a table named orders in a SQL database that looks like this:

    user_id    email            segment    destination    revenue    
    1          [email protected]    basic      New York       500
    1          [email protected]    luxury     London         750
    1          [email protected]    luxury     London         500
    1          [email protected]    basic      New York       625
    1          [email protected]    basic      Miami          925
    1          [email protected]    basic      Los Angeles    218
    1          [email protected]    basic      Sydney         200
    2          [email protected]   basic      Chicago        375
    2          [email protected]   luxury     New York       1500
    2          [email protected]   basic      Toronto        2800
    2          [email protected]   basic      Miami          750
    2          [email protected]   basic      New York       500
    2          [email protected]   basic      New York       625
    3          [email protected]      luxury     New York       650
    3          [email protected]      basic      New York       875
    4          [email protected]    luxury     Chicago        1300
    4          [email protected]    basic      New York       1200
    4          [email protected]    basic      New York       1000
    4          [email protected]    luxury     Sydney         725
    5          [email protected]    basic      London         500
    5          [email protected]    luxury     London         750

Here's a SQL Fiddle: http://www.sqlfiddle.com/#!9/22f40a/1

I'd like to be able to apply the following logic to get the final result set:

Return only the distinct user_id and the user's email based on the following conditions:

  1. where segment is equal to luxury and destination is equal to New York

OR

  1. where segment is equal to luxury and destination is equal to London

OR

  1. where segment is equal to basic and destination is equal to New York and the given user has a revenue amount in the basic and New York records that sums to greater than $2,000

BUT

  1. a given user has not previously been to destination equal to Miami

Based on my sample data, I would like to see the following returned:

user_id     email
3           [email protected]
4           [email protected]
5           [email protected]

I tried to use the following to get part of what I need:

SELECT
   DISTINCT(user_id),
   email
FROM orders o

WHERE
(o.segment = 'luxury' AND o.destination = 'New York')
OR
(o.segment = 'luxury' AND o.destination = 'London')

But, this query doesn't handle conditions #3 and #4 above. I feel like a window function might be helpful here, but I don't know quite how to implement it.

If someone could help me with this query, I would be incredibly grateful!

Thanks!

CodePudding user response:

You can use subqueries to achieve what you need:

SELECT
   DISTINCT(o.user_id),
   o.email
FROM orders o
WHERE
  (
    -- Clause 1
    (o.segment = 'luxury' AND o.destination = 'New York')
    OR
    -- Clause 2
    (o.segment = 'luxury' AND o.destination = 'London')
    OR
    -- Clause 3
    (o.user_id IN (
      SELECT DISTINCT(o.user_id)
      FROM orders o
      WHERE o.segment = 'basic' AND o.destination = 'New York'
      GROUP BY o.user_id, o.email, o.segment, o.destination
      HAVING SUM(o.revenue) > 2000
    ))
  )
  AND
  -- Clause 4
  o.user_id NOT IN (
    SELECT DISTINCT(o.user_id)
    FROM orders o
    WHERE o.destination = 'Miami'
  )
 

CodePudding user response:

here's another to do it by scanning the table once, group by and having:

SELECT user_id, email,
       SUM(case
          when segment='luxury' and destination in ('New York','London') then 1 
          else 0 
       end) as is_luxury,
       SUM(case
          when segment='basic' and destination in ('New York') then 1
          else 0
       end) as is_basic,       
       SUM(case
          when segment='basic' and destination in ('New York') then revenue
          else 0
       end) as basic_revenue,
       SUM(case when destination in ('Miami') then 1 else 0 end) as is_miami
FROM orders
GROUP BY 1,2
HAVING (is_luxury > 0 OR (is_basic > 0 AND basic_revenue > 2000))
  AND NOT is_miami;

  • Related