Home > Net >  SQL Server : inner join with multiple ANDs and ORs
SQL Server : inner join with multiple ANDs and ORs

Time:10-22

Each product has a LatLng. Over time, sometimes products can be duplicated but it is difficult to spot. The price or images may be slightly different.

I want to compare Products by their ProductType and their Price in each location and put them on a map so duplicates can be found more easily.

For each product I calculate 5% of its price so I can then add/subtract this from other Product prices to find rough matches.

Products

ProductID       ProductType        Price     Latitude        Longitude
--------------------------------------------------------------------
ABC             Red Widget         500       12.34           67.89
DEF             Red Widget         505       12.34           67.89
MNO             Red Widget         480       12.34           67.89
RST             Red Widget         500       12.34           67.89
UVW             Red Widget         300       12.34           67.89

JKL             Blue Widget        800       76.54           32.10
XYZ             Blue Widget        800       45.67           23.45

The expected result is that ABC, DEF, MNO AND RST should be returned because they are all Red Widgets, are in the same location and are have either exactly the same price or prices within 5% of each other's price.

UVW is outside the percentage so likely not a duplicate and should not be returned.

JKL and XYZ are Blue Widgets with the same price, but not in the same location, and thus not returned.

SQL

This is what I have so far but it is returning far too many results. I think it's returning rows that are matching outside of each location. The ANDs in the JOIN look correct to me. It feels like the ORs may be allowing more flexibility in matching but the enclosing brackets look correct...

WITH cte AS
(
    SELECT 
        p.ProductID,
        p.ProductType,
        p.Price,
        (p.Price / 100) * 5 AS PricePercent,
        ROUND(p.Latitude, 3) AS Latitude,
        ROUND(p.Longitude, 3) AS Longitude
    FROM 
        Products p
    WHERE
       p.Latitude IS NOT NULL AND p.Longitude IS NOT NULL 
)
SELECT 
    DISTINCT a.ProductID,
    a.Price,
    a.Latitude, a.Longitude
FROM 
    cte a
INNER JOIN 
    /* ProductIDs don't match */
    cte b ON a.ProductID <> b.ProductID
    /* match only where location is the same*/
          AND a.Latitude = b.Latitude 
          AND a.Longitude = b.Longitude
    /* match only where Product Type is the same*/
          AND a.ProductType = b.ProductType
    /*match only if price is the same, or within 5% above or 5% below price*/
          AND (/*same price*/
               b.Price = a.Price 
               OR
               /*b.price is within percentage over a.price*/
               (b.Price > a.Price AND b.Price < (a.Price a.PricePercent))
               OR
               /*b.price is within percentage under a.price */
               (b.Price < a.Price AND b.Price >(a.Price-a.PricePercent)))

CodePudding user response:

First, this thing :

 AND (/*same price*/
               b.Price = a.Price 
               OR
               /*b.price is within percentage over a.price*/
               (b.Price > a.Price AND b.Price < (a.Price a.PricePercent))
               OR
               /*b.price is within percentage under a.price */
               (b.Price < a.Price AND b.Price >(a.Price-a.PricePercent))
       )

Is (or should be) the same as this :

AND (
    -- price within range  /- percent
    b.Price < (a.Price a.PricePercent) 
    AND b.Price >(a.Price-a.PricePercent)
 )

And second for me it work... :

SQL Fiddle

MS SQL Server 2017 Schema Setup:

Query 1:

WITH cte AS
(
    SELECT 
        p.ProductID,
        p.ProductType,
        p.Price,
        (p.Price / 100) * 5 AS PricePercent,
        ROUND(p.Latitude, 3) AS Latitude,
        ROUND(p.Longitude, 3) AS Longitude
    FROM 
        Products p
    WHERE
       p.Latitude IS NOT NULL AND p.Longitude IS NOT NULL 
)
SELECT 
    DISTINCT a.ProductID,
    a.Price,
    a.Latitude, a.Longitude
FROM 
    cte a
INNER JOIN 
    /* ProductIDs don't match */
    cte b ON a.ProductID <> b.ProductID
    /* match only where location is the same*/
          AND a.Latitude = b.Latitude 
          AND a.Longitude = b.Longitude
    /* match only where Product Type is the same*/
          AND a.ProductType = b.ProductType
    /*match only if price is the same, or within 5% above or 5% below price*/
          AND (
    -- price within range  /- percent
    b.Price < (a.Price a.PricePercent) 
    AND b.Price >(a.Price-a.PricePercent)
 )

Results:

| ProductID | Price | Latitude | Longitude |
|-----------|-------|----------|-----------|
|       ABC |   500 |       12 |        68 |
|       DEF |   505 |       12 |        68 |
|       RST |   500 |       12 |        68 |
  • Related