Home > Software engineering >  Is it possible in sql to perform joins based on a subsets of data?
Is it possible in sql to perform joins based on a subsets of data?

Time:04-23

I'm running into an issue where I misunderstood what a left join was, and I don't really know how to word the question. I have a MVCE below

declare @matchableproperties table
(
    pname varchar(100)
)

declare @users table
(
   userid varchar(100),
   pname varchar(100),
   pvalue varchar(100)
)

insert into @matchableproperties values ('city')
insert into @matchableproperties values ('status')
insert into @matchableproperties values ('position')

insert into @users values (1, 'city', 'Wichita')
insert into @users values (1, 'status', 'Active')
insert into @users values (1, 'position', 'Captain')
insert into @users values (2, 'city', 'Wichita')
insert into @users values (2, 'status', 'Active')

select u.*, mp.* from @matchableproperties mp
left join @users u on mp.pname = u.pname
order by userid, mp.pname

which returns

u.userid u.pname u.pvalue mp.pname
1 city Wichita city
1 position Captain position
1 status Active status
2 city Wichita city
2 status Active status

My issue is in what I am trying to achieve. I want to know, for every userid, which pname's exist but also which pnames do not exist. Like, for userid 2, the "position" pname doesn't exist, so I'd like to have a the additional row below to show that user ID 2 doesn't match on all the specified properties.

u.userid u.pname u.pvalue mp.pname
2 null null position

After thinking about this, I realized that I want to do a left join on individual partitions of the users table -- essentially I want the results of

select u.*, mp.* from @matchableproperties mp
left join (select * from @users where userid = 1) u on mp.pname = u.pname
union all
select u.*, mp.* from @matchableproperties mp
left join (select * from @users where userid = 2) u on mp.pname = u.pname

The above query gives me the results I want (the two tables above, for a total of six rows), but since I can't know how many users will be un the users table I obviously can't hard code it. Is there some magic "partition by" or "group by" syntax I can use on the left join to get what I want from a single left join statement on the two tables?

CodePudding user response:

You need a projection of every possible combination of property and user. That's a CROSS JOIN. Once you have this projection you can LEFT JOIN back to the users table to see what actually matches:

SELECT u0.userid, mp.pname, u.pvalue
FROM (
    SELECT DISTINCT pname FROM @matchableproperties
) mp
CROSS JOIN (
    SELECT DISTINCT userid FROM @users
) u0
LEFT JOIN @users u on u.userid = u0.userid and u.pname = mp.pname

See it work here:

https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=533c75a50d6627fdc831b77c5dea47d3

This need for cross join, which tends to use a lot of memory and resources on the database server, is one reason among several why EAV schemas like this are best avoided.

  • Related