Working in postgresQL, I have a code value column where there are two values within one cell:
CREATE TABLE test.table1(Site varchar(50), CodeValue1 varchar(50));
INSERT INTO test.table1
VALUES ('A', '1005'),
('B', '1006|1007'),
('C', '1008');
I want to join the code value column to a second table, like:
CREATE TABLE test.table2(Note varchar(50), CodeValue2 varchar(50));
INSERT INTO test.table2
VALUES ('Note1','1005'),
('Note2', '1006'),
('Note3', '1007');
--this does not work:
select * from test.table1
left join test.table2 on table2.CodeValue2 = table1.Codevalue1
Is there a way to query the '1006|1007' so the join works for either 1006 or 1007 without splitting table 1 into more rows. My desired output would be this:
Site | Note | Code Value |
---|---|---|
A | Note 1 | 1005 |
B | Note 2 | 1006 |
B | Note 3 | 1007 |
CodePudding user response:
This is the best approach I have so far, using a subquery, I think this will work fine for my needs but keen to hear if there are other solutions:
select * from (
select Site
, unnest(string_to_array(CodeValue1, '|')) as CV
from test.table1) t1
left join (select * from test.table2) t2 on t2.CodeValue2 = t1.CV);
CodePudding user response:
Storing a delimited string (table1.codevalue1
) is an extremely bad idea, it will cause the same type of issues on virtually every query written against it. A much better approach would be to normalize your data and store each value as a separate row. But as to your direct question can it be written without splitting table1 into more rows - well yes it can:
select t1.site, t2.note, t2.codevalue2
from table1 t1
join table2 t2
on ( position ('|' || t2.codevalue2 || '|' in '|' || t1.codevalue1 || '|') > 0 );
But I certainly would not want work with that on
condition, and variants of it.
Your latest attempt (as your answer) is close. It fails because the outer join
results in the row ('c', 1008, null, null) also being returned. You can correct that by doing an inner join
. Finally, you can reduce to a much simpler version by eliminating your outer most select * from (...
. That format is very seldom necessary. Just:
select t1.site, t2.note, t2.codevalue2
from table2 t2
join (select site, string_to_table(codevalue1, '|') cv from table1) t1
on t1.cv = t2.codevalue2;
Note, string_to_table(...)
function instead of unnest(string_to_array(...))
functions.
See demo for all here.