Related rows ordering when using JOIN without ORDER BY-CodePudding

Let's say we have two tables:

user:
id,name                             
1,bob
2,alice

user_group:
id,user_id,group
1,1,g1
2,1,g2
3,2,g2
4,2,g3

We don't have guarantees that on each execution of SELECT * FROM user without ORDER BY result set will have the same order. But what about related rows in joins?

For example, SELECT user.name, user_group.group FROM user INNER JOIN user_group ON (user.id = user_group.user_id);. Will the related(joined) rows be adjacent in the result set(take PostgreSQL for ex.)? By that I imply:

bob,g1
bob,g2
alice,g2
alice,g3

alice,g3
alice,g2
bob,g2
bob,g1

and NOT this:

bob,g1
alice,g2
bob,g2
alice,g3

The order of users doesn't matter, the order of groups within each user too

CodePudding user response：

It is a fundamental rule in SQL that you can never rely on the ordering of a result set unless you add an ORDER BY. If you have no ORDER BY, the ordering of the result set can, among others, depend on

the order in which PostgreSQL reads the individual tables – it could be in index order or in sequential order, and even with a sequential scan you don't always get the same order (unless you disable synchronize_seqscans)
the join strategy chosen (nested loop, hash join or merge join)
the number of rows returned by the query (if you use a cursor, PostgreSQL optimizes the query so that the first rows can be returned quickly)

That said, with your specific example and PostgreSQL as database, I think that all join strategies will not return the result set in the order you describe as undesirable. But I wouldn't rely on that: often, the optimizer finds a surprising way to process a query.

The desire to save yourself an ORDER BY often comes from a wish to optimize processing speed. But correctness is more important than speed, and PostgreSQL can often find a way to return the result in the desired order without having to sort explicitly.