Home > Blockchain >  What's the use of this WHERE clause
What's the use of this WHERE clause

Time:12-18

this is an answer to the question : We need a list of customer IDs with the total amount they have ordered. Write a SQL statement to return customer ID (cust_id in the Orders table) and total_ordered using a subquery to return the total of orders for each customer. Sort the results by amount spent from greatest to the least. Hint: you’ve used the SUM() to calculate order totals previously.

SELECT prod_name,
       (SELECT Sum(quantity)
        FROM OrderItems
        WHERE Products.prod_id=OrderItems.prod_id) AS quant_sold
FROM Products

;

So there is this simple code up here, and I know that this WHERE clause is comparing two columns in two different tables. But since We are calculating the SUM of that quantity, why do need that WHERE clause exactly. I really couldn't get it. Why the product_id exactly and not any other column ( p.s: the only shared column between those two tables is prod_id column ) I am still a beginner. Thank you!

CodePudding user response:

First you would want to know the sum for each product - so need to adjust the subquery similar to this:

(SELECT prod_id, Sum(quantity) qty
 FROM OrderItems
 group by prod_id   
) AS quant_sold

then once you know how much for each product, then you can link that

SELECT prod_name,
    (SELECT prod_id, Sum(quantity) qty
     FROM OrderItems
     group by prod_id   
    ) AS quant_sold
FROM Products p
WHERE p.prod_id = quant_sold.prod_id

CodePudding user response:

  • Run it without the where clause and compare the results. You'll learn a lot that way. specifically focus on two different product Ids ensuring they both have order items and quantities.

You have two different tables involved. There are multiple products. You don't want the sum of all orders on each product; which is what you would get without the where clause. So the where clause correlates the two tables ensuring you only SUM the quantity of each order item for each product between the tables. Personally, I'd use a join, sum, and a group by as I find it easier to read and I'm not a fan of sub selects in the select of another query; but that's me.

SELECT prod_name,
       (SELECT Sum(quantity)
        FROM OrderItems
        WHERE Products.prod_id=OrderItems.prod_id) AS quant_sold
FROM Products

Should be the same as:

SELECT prod_name, Sum(coalesce(P.quantity,0))
FROM Products P
LEFT JOIN orderItems OI
 on P.prod_id=OI.prod_id
GROUP BY Prod_Name

'Notes

  • the above is untested.
  • a left join is needed because all products should be listed and if a product doesn't have an order, the quantity would be zero.
  • if we use an inner join, the product would be excluded.
  • We use coalesce because you'd have a "Null" quantity instead of zero for such lines without an order item.
  • as to which is "right" well it depends and varies on different cases. each has it's own merits and in different cases, one will perform better than another, and in a different case, vice-versa. See --> Join vs. sub-query

As an example:

  • Say you have Products A & B
  • "A" has Order Item Quantities of 1 & 2
  • "B" has order item Quantities of 10 & 20
  • If we don't have the where clause every result record would have qty 33
  • If we have the where product "A" would have 3
  • product "B" would have qty 30.
  • Related