How both asterisked lines are correct given the SQL statement-CodePudding

A new social network site has the following data tables:

Table users:

id	name	sex
1	Ann	null
2	Steve	m
3	Mary	f
4	Brenda	f

Table friends:

user1	user2
1	2
1	3
2	3

Select data that will be returned by the following SQL query:

  SELECT users.name, 
         COUNT(*) as count 
    FROM users LEFT JOIN 
         friends ON users.id = friends.user1 OR users.id = friends.user2
   WHERE users.sex = 'f'
GROUP BY users.id, 
         users.name;

Output:

name	count
Ann	1
Ann	2
Steve	1
Steve	2
Mary	1
Mary	2 **
Brenda	0
Brenda	1 **

The asterisked sections are the correct answers, but I don't quite understand why (Brenda, 1) would be an answer here when (Mary, 2) is also a correct answer.

CodePudding user response：

This is a trap question. The query looks like your are selecting all female users (sex = 'f') and count their friendships. For this a left outer join on the friends table is applied, so we keep all female users, no matter whether they have friendships or not.

But then the query uses COUNT(*), which counts the joined rows. That is 2 for Mary and 1 for Brenda. If we wanted to count friendships, we'd have to count a column of the friends table, e.g. COUNT(friends.user1). Thus the outer joined rows having these columns set to null would be omitted from the count, and we'd get 2 for Mary and 0 for Brenda.

CodePudding user response：

In order to address this problem and avoid the Brenda = 1 situation, you could:

generate a single "friends" table, by putting together (user1, user2) couples with (user2, user1) couples
joining this table with your "users" table by matching on "users.id" "cte.user1"
aggregate with a COUNT function on "cte.user1" field, grouping on the user name.

WITH cte AS (
    SELECT user1, user2 FROM friends
    UNION ALL 
    SELECT user2, user1 FROM friends
)
SELECT users.name,
       COUNT(cte.user1) AS cnt
FROM      users
LEFT JOIN cte
       ON users.id = cte.user1
GROUP BY users.name