Home > Mobile >  SQL - Choosing a single record for an Id based on an attribute
SQL - Choosing a single record for an Id based on an attribute

Time:05-20

I have a shirt table. There are multiple records for each Id (multiple brands). The Material will be same for the Id. The materials have a ranking association with Brand.

Material Ranking: Cotton: Jockey > Nike > Adidas

Any other Material: Adidas > Nike > Jockey

For my result, I want to select only 1 record per Id based on the above ranking. If the Id has cotton, I want the Jockey record to be chosen if Jockey is present, or if Jockey isn't there Nike must be chosen, if Nike also isn't there Adidas must be chosen.

For any other material, the second ranking should be followed.

Shirt table:

Id Brand Material Color Cost
1 Nike Cotton Black 500
1 Jockey Cotton Blue 100
1 Adidas Cotton Red 1000
2 Jockey Synthetic Orange 20
2 Nike Synthetic Green 2
3 Nike Cotton Black 500
4 Nike Wool Black 600
4 Jockey Wool Blue 20000
4 Adidas Wool Red 1000

Result:

Id Brand Material Color Cost
1 Jockey Cotton Blue 100
2 Nike Synthetic Green 2
3 Nike Cotton Black 500
4 Adidas Wool Red 1000

Explanation: Result table Records have been chosen as per the ranking. Since Id 1 has cotton, the Jockey record has been chosen. Since Id 2 does not have cotton, the Nike record has been chosen (as per the ranking) Since Id 3 has cotton, but does not have a Jockey record, the next rank has been chosen. Since Id 4 does not have cotton, the Adidas record has been chosen.

CodePudding user response:

One way is using top(1) with ties

select top(1) with ties *
from tbl
order by row_number() over(partition by id  
    order by
        case Material when 'Cotton' then 
           case Brand when 'Jockey' then 1 
                      when 'Nike' then 2 
                      when 'Adidas' then 3 end
        else 
           case Brand when 'Adidas' then 1 
                      when 'Nike' then 2
                      when 'Jockey' then 3 end
        end)

CodePudding user response:

Personally I would retrieve all the possible data in the backend and do the ranking and filtering there. Business logic will be much easier to maintain there, and easier than doing it in a query. This will probably be easier to optimize as well in regards to performance.

CodePudding user response:

You can get the ranking over (Brand, Material) with a CASE statement, then you can compare this field with the best ranking, obtainable using the MIN window function. Hence retrieve the rows where ranking = best_ranking:

SELECT Id,
       Brand,
       Material,
       Color,
       Cost
FROM (SELECT tab.*,
             CASE WHEN Brand = 'Jockey' AND Material =  'Cotton' THEN 1 
                  WHEN Brand = 'Nike'                            THEN 2
                  WHEN Brand = 'Adidas' AND Material <> 'Cotton' THEN 1
                                                                 ELSE 3
             END AS ranking,
             MIN(CASE WHEN Brand = 'Jockey' AND Material =  'Cotton' THEN 1 
                      WHEN Brand = 'Nike'                            THEN 2
                      WHEN Brand = 'Adidas' AND Material <> 'Cotton' THEN 1
                                                                     ELSE 3
                 END) OVER(PARTITION BY Id) AS best_ranking  
      FROM tab    ) ranked_brands
WHERE ranking = best_ranking

Try it here.

CodePudding user response:

Thinking in real life you would have more than 3 brands, a rankings table would be feasible to use.

DECLARE @Rankings TABLE
(
    Material VARCHAR(10),
    Brand VARCHAR(10),
    Ranking INT
);
INSERT @Rankings
(
    Material,
    Brand,
    Ranking
)
VALUES
('Cotton', 'Jockey', 1),
('Cotton', 'Nike', 2),
('Cotton', 'Adidas', 3),
(NULL, 'Adidas', 1),
(NULL, 'Nike', 2),
(NULL, 'Jockey', 3);

DECLARE @Items TABLE
(
    Id INTEGER,
    Brand VARCHAR(6),
    Material VARCHAR(9),
    Color VARCHAR(6),
    Cost INTEGER
);

INSERT INTO @Items
(
    Id,
    Brand,
    Material,
    Color,
    Cost
)
VALUES
(1, 'Nike', 'Cotton', 'Black', 500),
(1, 'Jockey', 'Cotton', 'Blue', 100),
(1, 'Adidas', 'Cotton', 'Red', 1000),
(2, 'Jockey', 'Synthetic', 'Orange', 20),
(2, 'Nike', 'Synthetic', 'Green', 2),
(3, 'Nike', 'Cotton', 'Black', 500),
(4, 'Nike', 'Wool', 'Black', 600),
(4, 'Jockey', 'Wool', 'Blue', 20000),
(4, 'Adidas', 'Wool', 'Red', 1000);

WITH ordered
AS (SELECT i.*,
           ROW_NUMBER() OVER (PARTITION BY i.Id ORDER BY r.Ranking) orderId
    FROM @Items i
        INNER JOIN @Rankings r
            ON i.Brand = r.Brand
               AND
               (
                   (
                       r.Material = i.Material
                       AND i.Material = 'Cotton'
                   )
                   OR
                   (
                       r.Material IS NULL
                       AND i.Material <> 'Cotton'
                   )
               ))
SELECT ordered.Id,
       ordered.Brand,
       ordered.Material,
       ordered.Color,
       ordered.Cost
FROM ordered
WHERE ordered.orderId = 1;
  • Related