Home > Back-end >  How to prevent duplicates when getting sum of multiple columns with multiple joins
How to prevent duplicates when getting sum of multiple columns with multiple joins

Time:04-05

Lets say I have 3 tables: Invoices, Charges, and Payments. Invoices can have multiple charges, and charges can have multiple payments.

Doing a simple join, data would look like this:

invoiceid  |   chargeid   |  charge   |  payment
----------------------------------
1          |      1       |    50     |   50
2          |      2       |    100    |   25
2          |      2       |    100    |   75
2          |      3       |    30     |   10
2          |      3       |    30     |   5

If I do an join with sums,

select invoiceid, sum(charge), sum(payment)
from invoices i 
inner join charges c on i.invoiceid = c.invoiceid
inner join payments p on p.chargeid = c.chargeid
group by invoiceid

The sum of payments would be correct but charges would include duplicates:

invoiceid  |    charges   |   payments
--------------------------------------
1          |     50       |     50
2          |     260      |     115

I want a query to get a list of invoices with the sum of payments and sum of charges per invoice, like this:

invoiceid  |    charges   |   payments
--------------------------------------
1          |     50       |     50
2          |     130      |     115

Is there any way to do this by modifying the query above WITHOUT using subqueries since subqueries can be quite slow when dealing with a large amount of data? I feel like there must be a way to only include unique charges in the sum.

CodePudding user response:

You can also achieve this by using LATERAL JOINS

   SELECT
     i.invoiceid,
     chgs.total_charges,
     pays.total_payments
   FROM
     invoices AS i
     JOIN LATERAL (
        SELECT
            SUM( charge ) AS total_charges
        FROM
            charges AS c
        WHERE
            c.invoiceid = i.invoiceid
     ) AS chgs ON TRUE
     JOIN LATERAL (
        SELECT
            SUM( payment ) AS total_payments
        FROM
            payments AS p
        WHERE
            p.chargeid = c.chargeid
     ) AS pays ON TRUE

CodePudding user response:

one way is to do the aggregation by the tables before the joins on the grouping value

SELECT i.invoiceid, SumOfCharge, SumOfInvoice
FROM  invoices i 
INNER JOIN  (SELECT InvoiceID, suM(charges) sumOfCharges 
            FROM charges c 
            GROUP BY Invoiceid) c
   on i.invoiceid = c.invoiceid
INNER JOIN (SELECT invoiceid, sum(payment) as SumOfPayment
            FROM charages c
            INNER JOIN payments p on p.chargeid = c.chargeid
            GROUP BY Invoiceid) P
   on i.invoiceID = p.invoiceid

Another way would be to do it inline per invoice using correlation

SELECT i.invoiceid
     , (SELECT SUM(charge) FROM charges c WHERE c.invoiceid = i.invoiceid) SumOfCharge
     , SUM(Payment) SumOfInvoice
FROM  invoices i 
INNER JOIN  charges c 
   on i.invoiceid = c.invoiceid
INNER JOIN payments p 
   on p.chargeid = c.chargeid
GROUP BY Invoiceid
  • Related