Home > Software design >  Is there a more idiomatic way to merge related rows from two tables?
Is there a more idiomatic way to merge related rows from two tables?

Time:02-27

I'm using a contrived example in order to illustrate the issue.

Imagine a simple table of books containing a title and subject/genre. In addition, there's an associated table of related subjects.

> SELECT * FROM books;
 ---- -------- ----------- 
| id | title  | subject   |
 ---- -------- ----------- 
|  1 | Book A | science   |
|  2 | Book B | reference |
|  3 | Book C | fiction   |
 ---- -------- ----------- 

> SELECT * FROM related_subjects;
 ---- --------- --------- 
| id | book_id | subject |
 ---- --------- --------- 
|  1 |       1 | physics |
|  2 |       1 | space   |
|  3 |       3 | crime   |
 ---- --------- --------- 

I'd like a query that could output all the title subject combinations, so that it would look something like:

 ---- -------- ----------- 
| id | title  | SUBJECT   |
 ---- -------- ----------- 
|  1 | Book A | science   |
|  1 | Book A | space     |
|  1 | Book A | physics   |
|  2 | Book B | reference |
|  3 | Book C | fiction   |
|  3 | Book C | crime     |
 ---- -------- ----------- 

The most obvious way, is to use a UNION as follows:

SELECT books.id, books.title, SUBJECT FROM books
UNION
SELECT books.id, books.title, related_subjects.subject FROM books
INNER JOIN related_subjects ON related_subjects.book_id = books.id;

Which yields a good result:

 ---- -------- ----------- 
| id | title  | SUBJECT   |
 ---- -------- ----------- 
|  1 | Book A | science   |
|  2 | Book B | reference |
|  3 | Book C | fiction   |
|  1 | Book A | space     |
|  1 | Book A | physics   |
|  3 | Book C | crime     |
 ---- -------- ----------- 

However, it would be preferable if the natural output ordering was similar to my desired output, where the books row comes out first, followed by its related rows from the related_subjects table, and so on.

I'm curious as to whether there's a better/more efficient way of doing this sort of task? Particularly one that would give me a more natural ordering without having to apply a sort on the end result first.

Note: of course, I know I can apply a DB sort to the union output by ordering on books.id, related_subjects.id, but the output in my real world app consists of hundreds of thousands of rows, and so no harm in avoiding a relatively expensive sort if it can be avoided.

CodePudding user response:

Introduce a computed column into the union query for ordering:

SELECT id, title, subject
FROM
(
    SELECT id, title, subject, 1 AS src FROM books
    UNION ALL
    SELECT b.id, b.title, rs.subject, 2
    FROM books b
    INNER JOIN related_subjects rs ON rs.book_id = b.id
) t
ORDER BY id, src;

CodePudding user response:

It seems like you have a one-to-many relationship between books and subjects. So you could drop the subject column from the books table, and just make sure all the subjects that apply to the book are in the related_subjects table. Then you don't have to use UNION to get both, you just do the join from book to related_subjects.

In theory SQL does not guarantee any order of query results unless you specify the ORDER BY. But in practice, InnoDB returns rows in the order it reads them in the index it uses to look them up.

  • Related