I want to do a left join on a table where the format of the two columns are not the same. I use REPLACE to remove the "[ ]" but I'm having trouble making one of the rows into two rows so be able to complete the join.
emp_tbl state_tbl
emp state id name
-------- ------- ------ -----
| Steve | [1] | | 1 | AL |
| Greg | [2|3] | | 2 | NV |
| Steve | [4] | | 3 | AZ |
-------- ------- | 4 | NH |
------ -----
Desired output:
-------- ------
| Steve | AL |
| Greg | NV |
| Greg | AZ |
| Steve | NH |
-------- ------
SELECT emp_tbl.emp, state_tbl.name
FROM emp_tbl
LEFT JOIN state_tbl on state_tbl.id = REPLACE(REPLACE(emp_tbl.state, '[', ''), ']', '')
With this query i can remove the "[ ]" and do the join, but the row with two "states" does obiously not work.
CodePudding user response:
Your query will never produce 4 rows because the left table only has 3 rows. You need to flatten the rows that contains multiple state_ids before the join.
- Prepare the table and data:
create or replace table emp_tbl (emp varchar, state string);
create or replace table state_tbl (id varchar, name varchar);
insert into emp_tbl values
('Steve', '[1]'), ('Greg', '[2|3]'), ('Steve', '[4]');
insert into state_tbl values
(1, 'AL'), (2, 'NV'), (3, 'AZ'), (4, 'NH');
- Then below query should give you the data you want:
with emp_tbl_tmp as (
select emp, parse_json(replace(state, '|', ',')) as states from emp_tbl
),
flattened_tbl as (
select emp, value as state_id from emp_tbl_tmp, table(flatten(input => states))
)
select emp, name from flattened_tbl emp
left join state_tbl state on (emp.state_id = state.id);
Or if you want to save one step:
with flattened_emp_tbl as (
select emp, value as state_id
from emp_tbl,
table(flatten(
input => parse_json(replace(state, '|', ','))
))
)
select emp, name from flattened_emp_tbl emp
left join state_tbl state
on (emp.state_id = state.id);
CodePudding user response:
here is how you can do it :
select emp_tbl.emp, state_tbl.name
from emp_tbl tw
lateral flatten (input=>split(parse_json(tw.state), '|')) s
left join state_tbl on s.value = state_tbl.id