I have two all the similar fields tables:
table_1:
field_1, field_2, field_3, field_4
table_2:
field_1, field_2, field_3, field_4
Here field_1
can be used as foreign key to join both tables.
I would like to get all the rows from table_1
and table_2
that have at least one row in table_1
but more than one in table_2
, or vice versa.
So far I have tried these related solutions:
- https://dba.stackexchange.com/questions/144313/how-do-i-find-mismatches-in-two-tables
- Compare two tables, find missing rows and mismatched data.
CodePudding user response:
Assuming both tables have the same row type: all the same column names and types (at least compatible), you can work with row types to simplify:
SELECT (t).*
FROM (SELECT t, count(*) AS ct1 FROM table_1 t GROUP BY 1) t1
JOIN (SELECT t, count(*) AS ct2 FROM table_2 t GROUP BY 1) t2 USING (t)
WHERE t1.ct1 > 1
OR t2.ct2 > 1;
- Group duplicates and remember the count in each table.
- Join the two tables, which removes all rows without match in the other table.
- Filter rows where at least one side has more than one copy.
- In the outer
SELECT
decompose the row type to get columns as usual.
I don't return row counts. If you need those, add ct1
and ct2
in the outer SELECT
.
This requires every column type to support the equality operator =
.
A prominent example that does not is json
. (But jsonb
does.) See:
If you have such columns, cast to text
to work around it. Or you can work with hash values - which also helps performance for very wide rows and/or many duplicates. Related:
CodePudding user response:
One way of getting all records from table_1 which have more than one matching record in table_2 is to count the number of matching records in a subquery, and put a condition on it:
SELECT *
FROM table_1 t1
WHERE (SELECT count(*)
FROM table_2 t2
WHERE t1.field_1 = t2.field_1) > 1
If you're looking to have both sides of this in one query, you can combine them with a UNION:
SELECT *
FROM table_1 t1
WHERE (SELECT count(*)
FROM table_2 t2
WHERE t1.field_1 = t2.field_1) > 1
UNION
SELECT *
FROM table_2 t2
WHERE (SELECT count(*)
FROM table_1 t1
WHERE t1.field_1 = t2.field_1) > 1