I am dealing with the following challenge:
I have multiple (sparse) tables that all share a common date field common "product_id"s. Additionally each table has one additional field which tracks a specific kind of transactional data for the product, such as "units sold", "units purchased", "received_support_requests", etc.
Table 1
DATE | PRODUCT ID | UNITS SOLD |
---|---|---|
2022-01-01 | 1 | 10 |
2022-01-02 | 2 | 40 |
Table 2
DATE | PRODUCT ID | UNITS PURCHASED |
---|---|---|
2022-01-01 | 2 | 456 |
2022-01-04 | 5 | 34 |
Table 3
DATE | PRODUCT ID | RECEIVED SUPPORT REQUESTS |
---|---|---|
2022-01-04 | 5 | 2 |
2022-01-05 | 2 | 1 |
My goal is to somehow join all of these tables so that I get a master table which shows what happened to each product on a specific day as shown below using MySQL:
DATE | PRODUCT ID | UNITS SOLD | UNITS PURCHASED | RECEIVED SUPPORT REQUESTS |
---|---|---|---|---|
2022-01-01 | 1 | 10 | 0 | 0 |
2022-01-01 | 2 | 0 | 456 | 0 |
2022-01-02 | 2 | 40 | 0 | 0 |
2022-01-04 | 5 | 0 | 34 | 2 |
2022-01-05 | 2 | 0 | 0 | 1 |
The tables are all very long > 50000 and contain a big list of products > 3000
I first though about building a kind of ("date" - "product id") scaffold and then just left join all of the tables to this scaffold table. Unfortunately the combination of each date, with each product gets too big.
How would you accomplish such a join in a more efficient way?
CodePudding user response:
SELECT date,
product_id,
COALESCE(table1.amount, 0) sold,
COALESCE(table2.amount, 0) purchased,
COALESCE(table3.amount, 0) supported
FROM ( SELECT date FROM table1
UNION
SELECT date FROM table2
UNION
SELECT date FROM table3 ) dates
CROSS
JOIN ( SELECT product_id FROM table1
UNION
SELECT product_id FROM table2
UNION
SELECT product_id FROM table3 ) products
NATURAL LEFT JOIN table1
NATURAL LEFT JOIN table2
NATURAL LEFT JOIN table3
HAVING sold purchased supported;