Edited again to add: the 2nd table ("dogs") have null user_guid values. If i filter out the NULL values then the NOT IN
operator works without WHERE
. And I still dont know why :|
Edited: Sorry I realized that I flipped the question. The question was the number of users who are in "users" table and NOT in the "dogs" table. My original code was:
select count(distinct u.user_guid)
from users u
where u.user_guid not IN (select d.user_guid
from dogs d)
Which got result of 0 count. The answer was supposed to be 2226
Only when I add the WHEN
condition then i got the right answer:
select count(distinct u.user_guid)
from users u
where u.user_guid not IN (select d.user_guid
from dogs d
WHERE d.user_guid = u.user_guid);
I still don't know why the WHERE
condition makes the difference here
The question was:
How could you determine the number of unique users in the "users" table who were also in the "dogs" table?
The correct answer was supposed to be
select count(u.user_guid)
from users u
where u.user_guid IN (select d.user_guid
from dogs d
where u.user_guid=d.user_guid);
Now I don't understand the point of the last WHERE filter. I thought the "IN" operator would have already made sure to filter only for values of u.user_guid that also exists in the d.user_guid (ie. the "IN" operator would have ensured that there are matching values) , which is the same as what the WHERE condition did (ie. u.user_guid = d.user_guid). And yet when I run the 2 codes (with and without the last WHERE line), the results were different. I'm not sure where I went wrong :/
CodePudding user response:
Below query finds the users who exists and are not unique on the dogs
table , and based on the users found on the dogs table , only the unique values are returned from the users
table:
select u.user_guid
from users u
where exists ( select d.user_guid
from dogs d
where u.user_guid=d.user_guid
)
group by user_guid
having count(*)=1;
The last where condition where u.user_guid=d.user_guid
returns only the users from dogs
table which are found on the users
table, so it should produce wrong results in your case.
If you want only the number change u.user_guid
with count(u.user_guid)
select count(u.user_guid)
from users u
where exists ( select d.user_guid
from dogs d
where u.user_guid=d.user_guid
)
group by user_guid
having count(*)=1;
CodePudding user response:
If you want to count unique users you might need distinct
keyword if there are multiple rows for any user. For example: count( distinct u.user_guid)
.
For IN
you don't need where u.user_guid=d.user_guid
. But I would suggest the solution usingexists
UsingIN
:
select count( u.user_guid)
from users u
where u.user_guid IN (select d.user_guid
from dogs d);
Using Exists
:
select count( u.user_guid)
from users u
where exists IN (select 1
from dogs d
where u.user_guid=d.user_guid);
Examples of IN
and exists
In users table there are three value for user_guid. 1,2 and 3. 2 is there two times. But in dogs table user_guid are 2,3 and 5. 2 and 3 are common in both tables. Please check the queries and outputs.
Schema and insert statements:
create table users(user_guid int);
create table dogs(user_guid int);
insert into users values(1);
insert into users values(2);
insert into users values(2);
insert into users values(3);
insert into dogs values(2);
insert into dogs values(3);
insert into dogs values(5);
Query: Using IN
select count(distinct u.user_guid) unique_users
from users u
where u.user_guid IN (select d.user_guid from dogs d );
Output:
unique_users |
---|
2 |
Query: Using EXISTS
select count(distinct u.user_guid) unique_users
from users u
where exists (select 1 from dogs d where u.user_guid = d.user_guid);
Output:
unique_users |
---|
2 |
db<>fiddle here
CodePudding user response:
Hey guys I think you're overcomplicating things here.
If you do a simple join you only get the values which are in both tables.
If you do a left join the fields which aren't matched in the second table return null.
NB: if we join on a column which is nullable we could get false results but that's not good in any case.
create table dogs (d int); create table user (u int); insert into dogs values (1),(2),(3),(4), (1),(2),(5),(6); insert into user values (1),(2),(3),(4), (1),(2),(7),(8);
SELECT COUNT(DISTINCT d) as du FROM dogs join user on d=u;
| du | | -: | | 4 |
SELECT COUNT(DISTINCT d) as dd FROM dogs LEFT JOIN user on d=u WHERE u IS NULL;
| dd | | -: | | 2 |
db<>fiddle here