I have a large table that I'm wanting to select 8 columns out of. I would be comparing it to another table with 2 columns making sure they (said columns) match exactly.
Pseudo Code:
SELECT a, b, c, d, e, f, g, h, i
FROM table1 t1
WHERE a AND b are matching the same rows in table 2
I've done this with a similar example, but only had 1 column in table 2 instead of 2 like so:
SELECT a, b, c, d, e, f, g, h, i
FROM table t1
WHERE a IN (SELECT * FROM table2 t2)
which gives me the results. But again, I now need to make sure that a AND b match the corresponding rows from table2. I've been searching but cannot find a solution. I've tried INNER JOINS, but I'm getting more results than there should be. Any help would be greatly appreciated as SQL is definitely not my wheelhouse. Also if there are any other clarifications, please let me know.
I think this is simple enough of a question, I hope. Thanks in advance!
In response to the comment:
Table 1 (has 20 million rows)
Address | Zip | Year | Area | Name1 | Name2 | example | example |
---|---|---|---|---|---|---|---|
123 Main St | 77777 | 1999 | 2000 | John Doe | John Smith | data | data |
456 Main St | 88888 | 2012 | 2500 | James Doe | John Smith | data | data |
789 Main St | 99999 | 2018 | 2800 | Michael Doe | Michelle Doe | data | data |
Table 2 (has 7500 rows):
Address | Zip |
---|---|
123 Main St | 77777 |
321 Smith St | 66666 |
789 Main St | 99999 |
455 Highway 1 | 44444 |
Results would be expected (7500 rows)
Address | Zip | Year | Area | Name1 | Name2 | example | example |
---|---|---|---|---|---|---|---|
123 Main St | 77777 | 1999 | 2000 | John Doe | John Smith | data | data |
789 Main St | 99999 | 2018 | 2800 | Michael Doe | Michelle Doe | data | data |
CodePudding user response:
It looks to me like you want to filter the content of table1
according to the address/zip tuples that are present in table2
.
Here is one approach using exists
and a correlated subquery:
select t1.*
from table1 t1
where exists (select 1 from table2 t2 where t2.address = t1.address and t2.zip = t1.zip)
For performance, consider an index on table2(address, zip)
.
You can also use in
, which sometimes brings surprising performance benefits:
select t1.*
from table1 t1
where (zip, address) in (select zip, address from table2)
CodePudding user response:
One option is to JOIN the two tables:
SELECT t1.*
FROM Table1 t1
INNER JOIN Table2 t2 on t1.a = t2.a and t1.b = t2.b
Just be aware, if Table2 can have more than one row with the same a,b
value you can end up with duplicate results. Usually the EXISTS()
option will perform the best, but sometimes the database can surprise you.