I'm trying to filter a table based on the queried result from another table.
create temporary table test_table (id number, col_a varchar);
insert into test_table values
(1, 'a'),
(2, 'b'),
(3, 'aa'),
(4, 'a'),
(6, 'bb'),
(7, 'a'),
(8, 'c');
create temporary table test_table_2 (id number, col varchar);
insert into test_table_2 values
(1, 'aa'),
(2, 'bb'),
(3, 'cc'),
(4, 'dd'),
(6, 'ee'),
(7, 'ff'),
(8, 'gg');
Here I want to find out all the id's in test_table
with value "a" in col_a
, and then I want to filter for rows with one of these id's in test_table_2
. I've tried this below way, but got an error: SQL compilation error: syntax error line 6 at position 39 unexpected 'cte'.
with cte as
(
select id from test_table
where col_a = 'a'
)
select * from test_table_2 where id in cte;
This approach below does work, but with large tables, it tends to be very slow. Is there a better more efficient way to scale to very large tables?
with cte as
(
select id from test_table
where col_a = 'a'
)
select t2.* from test_table_2 t2 join cte on t2.id=cte.id;
CodePudding user response:
I would express this using exists logic:
SELECT id
FROM test_table_2 t2
WHERE EXISTS (
SELECT 1
FROM test_table t1
WHERE t2.id = t1.id AND
t1.col_a = 'a'
);
This has one advantage over a join in that Snowflake can stop scanning the test_table_2
table as soon as it finds a match.
CodePudding user response:
your first error can be fixed as below. Joins are usually better suited for lookups compared to exists or in clause if you have a large table.
with cte as
(
select id from test_table
where col_a = 'a'
)
select * from test_table_2 where id in (select distinct id from cte);