I've got 3 tables I need to work with:
CREATE TABLE invoices (
id INTEGER,
number VARCHAR(256)
)
CREATE TABLE items (
invoice_id INTEGER,
total DECIMAL
)
CREATE TABLE payments (
invoice_id INTEGER,
total DECIMAL
)
I need a result set along the lines of:
invoices.id | invoices.number | item_total | payment_total | oustanding_balance |
---|---|---|---|---|
00001 | i82 | 42.50 | 42.50 | 00.00 |
00002 | i83 | 89.99 | 9.99 | 80.00 |
I tried
SELECT
invoices.*,
SUM(items.total) AS item_total,
SUM(payments.total) AS payment_total,
SUM(items.total) - SUM(payments.total) AS oustanding_balance
FROM
invoices
LEFT OUTER JOIN items ON items.invoice_id = invoices.id
LEFT OUTER JOIN payments ON payments.invoice_id = invoices.id
GROUP BY
invoices.id
But that fails. The sum for payments ends up wrong since I'm doing 2 joins here and I end up counting payments multiple times.
I ended up with
SELECT
invoices.*,
invoices.item_total - invoices.payment_total AS oustanding_balance
FROM
(
SELECT invoices.*,
(SELECT SUM(items.total FROM items WHERE items.invoice_id = invoices.id) AS item_total,
(SELECT SUM(payments.total FROM payments WHERE payments.invoice_id = invoices.id) AS payment_total
) AS invoices
But ... that feels ugly. Now I've got subqueries going on everywhere. It DOES work, but I'm concerned about performance?
There has to be some good way to do this with joins - I'm sure I'm missing something super obvious?
CodePudding user response:
As you say the sum behavior with multiple joins is normal and working with sub queries (Or CTE for SQl Server) is not a bad practice.
Doing such GOUP BY on an ID and a total in sub queries won't significantly downgrade your performance (depending on your tables sizes).
Another solution could be doing one SUM sub query for each column you need. It would be easier to understand this way I think :
SELECT
invoices.id
, i_total.total as item_total
, p_total.total aspayment_total
, ( i_total.total - p_total.total) as outstanding_balance
FROM
invoices
LEFT JOIN (
SELECT invoice_id, SUM(total) as total FROM items GROUP BY invoice_id
) i_total
ON i_total.invoice_id = invoices.id
LEFT JOIN (
SELECT invoice_id, SUM(total) as total FROM payments GROUP BY invoice_id
) p_total
ON p_total.invoice_id = invoices.id
CodePudding user response:
I think a common table expression (or in this case two CTEs) will give you what you want. You are using something called a scalar, which is precisely speaking not wrong, but as you correctly identified is ugly, hard to read, hard to maintain and can be non-performant in many situations.
CTE essentially take a query and makes it "behave" like a table. We define it once and then we can refer to it later.
with item_data as (
SELECT invoice_id, SUM(total) as item_total
FROM items
group by invoice_id
),
payment_data as (
SELECT invoice_id, SUM(total) as payment_total
FROM payments
group by invoice_id
)
select
i.*,
id.item_total - pd.payment_total as outstanding_balance
from
invoices i
join item_data id on i.invoice_id = id.invoice_id
join payment_data pd on i.invoice_id = pd.invoice_id
Untested, but hopefully you get the idea.