I have a table like below and need to fetch the (cheapest price closest distance) vendor for each product. Find cheapest vendor for each product by price, and in case there are multiple vendors at same price, then go for closest by distance. All columns are varchars by the way.
Can you pls suggest optimal way of doing this using SQL.
(I am a newbie to SQL, and my searching skills are probably not upto mark. All my search attempts bring me to finding lowest among single columns etc. Is there a 'category' or 'keyword' for this type of problem ?)
Any help is highly appreciated, Cheers!
Source table -
vendor|product_a|product_b|product_c|product_d|product_e|address |distance|
------ --------- --------- --------- --------- --------- --------- --------
a |$12 |$14 |$19 |$22 |$10 |Chicago |800 |
b |$8 |$12 |$15 | | |New York |900 |
c |$25 |$20 | |$16 |$10 |Houston |975 |
d |$10 |$12 |$14 | |$20 |Anchorage|825 |
e |$15 | |$11 |$20 |$15 |Detroit |850 |
Expected Result -
vendor |cost |vendor |distance |address |
------------- --------- --------- --------- ---------
product_a |$8 |b |900 |New York |
product_b |$12 |d |825 |Anchorage|
product_c |$11 |e |850 |Detroit |
product_d |$16 |c |975 |Houston |
product_e |$10 |a |800 |Chicago |
CodePudding user response:
Here's one solution can retrieve what you desired: (I believe we also can use PIVOT to solve this, you can check that too)
SELECT T1.PRODUCT, T1.COST, T1.VENDOR, T1.DISTANCE, T1.ADDRESS
FROM
(
SELECT T.*, RANK() OVER(PARTITION BY T.REF ORDER BY T.COST ASC, T.DISTANCE ASC) R
FROM
(
SELECT 1 AS REF, CAST(SUBSTR(PRODUCT_A, 2, LENGTH(PRODUCT_A)) AS UNSIGNED) AS COST, VENDOR, 'PRODUCT_A' AS PRODUCT, DISTANCE, ADDRESS
FROM PRODUCTS
WHERE COALESCE(PRODUCT_A, '') <> ''
UNION ALL
SELECT 2 AS REF, CAST(SUBSTR(PRODUCT_B, 2, LENGTH(PRODUCT_B)) AS UNSIGNED) AS COST, VENDOR, 'PRODUCT_B' AS PRODUCT, DISTANCE, ADDRESS
FROM PRODUCTS
WHERE COALESCE(PRODUCT_B, '') <> ''
UNION ALL
SELECT 3 AS REF, CAST(SUBSTR(PRODUCT_C, 2, LENGTH(PRODUCT_C)) AS UNSIGNED) AS COST, VENDOR, 'PRODUCT_C' AS PRODUCT, DISTANCE, ADDRESS
FROM PRODUCTS
WHERE COALESCE(PRODUCT_C, '') <> ''
UNION ALL
SELECT 4 AS REF, CAST(SUBSTR(PRODUCT_D, 2, LENGTH(PRODUCT_D)) AS UNSIGNED) AS COST, VENDOR, 'PRODUCT_D' AS PRODUCT, DISTANCE, ADDRESS
FROM PRODUCTS
WHERE COALESCE(PRODUCT_D, '') <> ''
UNION ALL
SELECT 5 AS REF, CAST(SUBSTR(PRODUCT_E, 2, LENGTH(PRODUCT_E)) AS UNSIGNED) AS COST, VENDOR, 'PRODUCT_E' AS PRODUCT, DISTANCE, ADDRESS
FROM PRODUCTS
WHERE COALESCE(PRODUCT_E, '') <> ''
) T
) T1
WHERE T1.R = 1;
CodePudding user response:
For folks still hanging on MySQL 5.7 which does not support window functions, we can still do it in a single query. Since we need to use UNION, it's not possible to use ORDER BY product_a,distance_b limit 1
to get the lowest cost and the shortest distance (if multiple vendors having the same lowest cost exist). Thus, agrregation functions should be used as a workaround, which unfortunately increase the complexity of the query. Note, since OP claims that all columns are of varchar type and the dollar sign does not exist in the product cost column, we can use explict data type convertion for the sake of proper calculations. And during the string to number convertion, an empty string will be converted to 0, which should be disqualified as the min cost value. Here is the query tested in workbench using ttt as the tablename:
select 'product_a' as vendor, product_a as cost, vendor, distance, address from ttt
where distance=( select min(distance) from ttt where product_a = (select min(cast(product_a as unsigned)) from ttt where product_a>0))
and
product_a= (select min(cast(product_a as unsigned)) from ttt where product_a>0)
union
select 'product_b' as vendor, product_b as cost, vendor, distance, address from ttt
where distance=( select min(distance) from ttt where product_b = (select min(cast(product_b as unsigned)) from ttt where product_b>0))
and
product_b= (select min(cast(product_b as unsigned)) from ttt where product_b>0)
union
select 'product_c' as vendor, product_c as cost, vendor, distance, address from ttt
where distance=( select min(distance) from ttt where product_c = (select min(cast(product_c as unsigned)) from ttt where product_c>0))
and
product_c= (select min(cast(product_c as unsigned)) from ttt where product_c>0)
union
select 'product_d' as vendor, product_d as cost, vendor, distance, address from ttt
where distance=( select min(distance) from ttt where product_d = (select min(cast(product_d as unsigned)) from ttt where product_d>0))
and
product_d= (select min(cast(product_d as unsigned)) from ttt where product_d>0)
union
select 'product_e' as vendor, product_e as cost, vendor, distance, address from ttt
where distance=( select min(distance) from ttt where product_e = (select min(cast(product_e as unsigned)) from ttt where product_e>0))
and
product_e= (select min(cast(product_e as unsigned)) from ttt where product_e>0)
;