Home > database >  SQL query to find data that doesn't meet a threshold
SQL query to find data that doesn't meet a threshold

Time:04-22

I need to find a list of value combinations that will identify opportunities as combinations that don't exist. The data I have looks like this.

Company Product
A 1
A 2
A 3
B 2
C 1
C 3
D 1
E 2
E 4

I want to identify products that at least 3 companies sell then tell the companies that aren't selling what products are popular. For this example, products 1 and 2 are sold by at least 3 companies. Product 1 is not being sold by companies B and E, and product 2 is not being sold by companies C and D. The query results would look like.

Company Product
B 1
E 1
C 2
D 2

Oracle is the database, and the actual number of rows is ~ 1M.

CodePudding user response:

You can aggregate and use a HAVING clause to find the products that meet the threshold and then use a PARTITIONed OUTER JOIN to find the non-matching rows:

SELECT t.company, p.product
FROM   ( SELECT product
         FROM   table_name
         GROUP BY product
         HAVING COUNT(DISTINCT Company) >= 3) p
       LEFT OUTER JOIN table_name t
       PARTITION BY (t.company)
       ON (p.product = t.product)
WHERE  t.product IS NULL

Which, for your sample data:

CREATE TABLE table_name (Company, Product) AS
SELECT 'A', 1 FROM DUAL UNION ALL
SELECT 'A', 2 FROM DUAL UNION ALL
SELECT 'A', 3 FROM DUAL UNION ALL
SELECT 'B', 2 FROM DUAL UNION ALL
SELECT 'C', 1 FROM DUAL UNION ALL
SELECT 'C', 3 FROM DUAL UNION ALL
SELECT 'D', 1 FROM DUAL UNION ALL
SELECT 'E', 2 FROM DUAL UNION ALL
SELECT 'E', 4 FROM DUAL;

Outputs:

COMPANY PRODUCT
B 1
C 2
D 2
E 1

db<>fiddle here

  • Related