When using 'EXISTS' or 'NOT EXISTS', why is the subquery's 'WHERE&#039-CodePudding

Lets say we have two tables

Customers table:

id	name
1	Joe
2	Henry
3	Sam
4	Max

Orders table:

id	customerID
1	3
2	1

I want to find the id of customers who never order. Basically, show me an output of all the Customers.id that do not appear in the Orders table which should be:

Expected output table: | id | |:---| | 2| | 4|

The SQL code to do this is:

SELECT Customers.id
FROM Customers
WHERE NOT EXISTS (
    SELECT Orders.customerId
    FROM Orders
    WHERE Customers.id = Orders.customerId 
)

My question is: Why do we need to write the second WHERE clause? Wouldn't the subquery return the same result even without the WHERE clause?

How is:

  WHERE NOT EXISTS (
        SELECT Orders.customerId
        FROM Orders
        WHERE Customers.id = Orders.customerId 
    )

different to:

  WHERE NOT EXISTS (
        SELECT Orders.customerId
        FROM Orders
    )

However, the query returns zero rows without the 2nd WHERE clause and I don't understand why. Could anyone explain?

CodePudding user response：

Because you need to "join" the subquery to the main query

Imagine the inner query executed for each row of the main query.

If you don't use the inner WHERE clause the inner query (in your case) will always return rows, so the NOT EXISTS will always return FALSE for each customer.

Viceversa, if the Orders table is empty the inner query will not return rows so the NOT EXISTS will be TRUE for all the Customers.

CodePudding user response：

The expression in the WHERE have to boolean, which means that they have to return TRUE or FALSE. If the subquery returns a number of lines is that true or false?
Your query could also be writen

SELECT Customers.id
FROM Customers
LEFT JOIN Orders
ON Customers.id = Orders.customerId 
WHERE Orders.customerId IS NULL;

because the LEFT JOIN says take all records from the first table (Customers) whether we find a matching record in the second table. If there is no matching record all columns requested from the second table will be null.