I have this query which selects the users for a given branch.
SELECT u.first, u.last, u.id,
(SELECT u1.user_type
FROM users_branches AS u1
WHERE u.id = u1.user_id AND u1.branch_id = @foo
LIMIT 1)
FROM users AS u
WHERE EXISTS
(SELECT 1
FROM users_branches AS u0
WHERE u.id = u0.user_id AND u0.branch_id = @foo)
And here is the EF Core LINQ query which produces this SQL.
int branchId = 123;
var users = await Users
.Where(x => x.UsersBranches.Any(ub => ub.BranchId == branchId))
.Select(user => new
{
UserId = user.Id,
FirstName = user.First,
LastName = user.Last,
UserType = user.UsersBranches.FirstOrDefault(x => x.BranchId == branchId).UserType
}).ToListAsync();
I don't like the WHERE
clause needing to be duplicated just to say "get me only the users which have a users_branches record for this branch, and also return some of the data from that record". Is there a more efficient way to do this? I can imagine a solution that uses GROUP BY, but I'm not certain that would actually be any better.
CodePudding user response:
one way you can do this by using joins
SELECT u.id AS UserId, u.first AS FirstName, u.last AS LastName
FROM users u
INNER JOIN users_branches ub ON u.id = ub.user_id
WHERE ub.branch_id = 123
another way you can do this by using subquery
SELECT u.id AS UserId, u.first AS FirstName, u.last AS LastName
FROM users u
WHERE u.id IN (
SELECT user_id
FROM users_branches
WHERE branch_id = 123
)
if you see main difference is that the join version queries both tables in one statement, while the subquery version filters UsersBranches first in a subquery before querying Users.
Note:The join is generally more efficient since it allows the database to access both tables at once and optimize the query plan. The subquery can result in slower performance since it splits the query into two parts.
-- hope this helps.
CodePudding user response:
I find this version easier to read:
SELECT users.first, users.last, users.id, MIN(users_branches.user_type)
FROM users
INNER JOIN users_branches ON
users.id = users_branches.user_id
AND users_branches.branch_id = @foo
GROUP BY users.id
as to whether it delivers better performance, that will depend on a number of factors, like indices etc. I recommend running some tests and examining the output of EXPLAIN SELECT ....
for the different queries you try
CodePudding user response:
Maybe can you try something like this
SELECT u.first, u.last, u.id, u0.user_type
FROM users u
JOIN users_branches u0 ON u.id = u0.user_id AND u0.branch_id = @foo
WHERE NOT EXISTS (
SELECT 1
FROM users_branches u1
WHERE u1.user_id = u0.user_id
AND u1.branch_id = u0.branch_id
AND u1.id < u0.id
)