Home > Enterprise >  Terminology for when a join propagates out additional rows
Terminology for when a join propagates out additional rows

Time:07-24

When joining between two tables/queries:

with 
cte1 (id) as (
select 1 from dual),
cte2 (id) as (
select 1 from dual union all
select 1 from dual)


select
    cte1.id as cte1_id,
    cte2.id as cte2_id
from
    cte1
left join 
    cte2
    on cte1.id = cte2.id

CTE1_ID   CTE2_ID
      1         1
      1         1

Unsurprisingly, the join propagates out additional rows. The query on the left side of the join only had one row. But the resultset has two rows due to the join.


I suspect “propagate” isn’t quite the right word for describing that scenario.

What’s the proper term?

For example, when talking to people who are new to SQL, I often say, “Be careful with that join. It looks like you’re accidentally propagating out additional rows.”

CodePudding user response:

In this example you are not propagating rows (at least in my understanding anyway). You have two rows in the table on the right side of the join and you have two rows in the result.

However, if you had this:

WITH cte AS
(
SELECT 1 AS id FROM dual
UNION ALL
SELECT 1 FROM dual
)
SELECT x.id, y.id
FROM cte x 
INNER JOIN cte y ON x.id = y.id;

You would start with 2 rows and the query would return 4, because the join is partial. To me, this is propagating data.

When every row from one side of the join is joined with every row on the other, the term you are looking for is a "cartesian product", which is achieved in SQL using a "cross join" or, in cases where the join is not unique but is limited partially, you could use "partial cartesian product" (though I don't recommend it) or more commonly a "partial cross-join". I think the latter is more likely to be readily appreciated by SQL developers.

In either case, there are times where both can be appropriate but a lot of the time they are the result of an error in a join clause.

CodePudding user response:

why a LEFT JOIN an INNER JOIN would do the same!

And no it doesn't propagate, you have in cte2 2 id's with 1 that is what UNION ALL actually amkes so when you join both tables, with the same id you will receive 2 rows as joined result set.

A Left Join also takes all rows of the left tables and troes to join in your case by the id and if it didn't find any companion, it adds the row with the right table columns as NULL.

So no wonders and no miracles, simple SQL

CodePudding user response:

What’s the proper term?

"Cartesian Product" could be one term you can use.

I.e. "Be careful of that join. It looks like you are accidentally returning the cartesian product of the two tables."


A CROSS JOIN will return the cartesian product of the two joined tables; it is also called a "Cartesian Join".

An INNER JOIN will return the cartesian product of the two joined tables that is filtered by some relationship (the join condition(s)) between columns of the two tables; it is also called an "Equi Join".

An OUTER JOIN is similar to the INNER JOIN but will also return the non-matched rows on one (for LEFT or RIGHT joins) or both (for FULL joins) sides of the join condition.

  • Related