Home > Software engineering >  Join cell with two values in SQL?
Join cell with two values in SQL?

Time:07-15

Working in postgresQL, I have a code value column where there are two values within one cell:

CREATE TABLE test.table1(Site varchar(50), CodeValue1 varchar(50));
INSERT INTO test.table1
VALUES ('A', '1005'),
       ('B', '1006|1007'),
       ('C', '1008');

I want to join the code value column to a second table, like:

CREATE TABLE test.table2(Note varchar(50), CodeValue2 varchar(50));
INSERT INTO test.table2
VALUES ('Note1','1005'),
       ('Note2', '1006'), 
       ('Note3', '1007');


--this does not work:
select * from test.table1
left join test.table2 on table2.CodeValue2 = table1.Codevalue1

Is there a way to query the '1006|1007' so the join works for either 1006 or 1007 without splitting table 1 into more rows. My desired output would be this:

Site Note Code Value
A Note 1 1005
B Note 2 1006
B Note 3 1007

CodePudding user response:

This is the best approach I have so far, using a subquery, I think this will work fine for my needs but keen to hear if there are other solutions:

select * from (
select Site
, unnest(string_to_array(CodeValue1, '|')) as CV
from  test.table1) t1
left join (select * from test.table2) t2 on t2.CodeValue2 = t1.CV);

CodePudding user response:

Storing a delimited string (table1.codevalue1) is an extremely bad idea, it will cause the same type of issues on virtually every query written against it. A much better approach would be to normalize your data and store each value as a separate row. But as to your direct question can it be written without splitting table1 into more rows - well yes it can:

select t1.site, t2.note, t2.codevalue2 
  from table1 t1
  join table2 t2 
    on ( position ('|' || t2.codevalue2 || '|'  in '|' || t1.codevalue1 || '|') > 0 );

But I certainly would not want work with that on condition, and variants of it.

Your latest attempt (as your answer) is close. It fails because the outer join results in the row ('c', 1008, null, null) also being returned. You can correct that by doing an inner join. Finally, you can reduce to a much simpler version by eliminating your outer most select * from (.... That format is very seldom necessary. Just:

select t1.site, t2.note, t2.codevalue2
  from table2 t2
  join (select site, string_to_table(codevalue1, '|') cv from table1) t1
    on t1.cv = t2.codevalue2;

Note, string_to_table(...) function instead of unnest(string_to_array(...)) functions.

See demo for all here.

  • Related