Home > Software design >  How to eliminate redundant values in sql calculation?
How to eliminate redundant values in sql calculation?

Time:09-22

I'm creating a monthly revenue report for each state in the U.S. To make things simple, I focused on Alaska. From the query and results below, you can see that each invoice can be associated with multiple invoice items:

SELECT
   i.id AS "i.id",
   i.subtotal,
   i.credit,
   i.tax,
   ii.id AS "ii.id",
   ii.refunded_amount
FROM
   invoices i
   JOIN customer ON i.customer_id = customer.id
   JOIN invoice_items ii ON i.id = ii.invoices_id
WHERE
   i.status = 'Paid'
   AND i.datepaid BETWEEN '2016-01-01' AND '2016-02-01'
   AND customer.billing_day <> 0
   AND customer.register_date < '2016-02-01'
   AND customer.account_exempt = 'f'
   AND customer.country = 'US'
   AND customer.state = 'AK';

  i.id   | subtotal | credit |  tax  | ii.id  | refunded_amount 
---------- ---------- -------- ------- -------- -----------------
27111851 |      100 |      0 |    20 | 746219 |               0
27111851 |      100 |      0 |    20 | 746218 |              15
27111851 |      100 |      0 |    20 | 746217 |               0
27111852 |        0 |      1 |     0 | 746217 |               0
27111853 |      200 |      0 |    40 | 746220 |               0

I want to tabulate the monthly Sales, Tax, and Gross revenue for the state of Alaska. Below is the query I wrote and the results:

SELECT
    customer.state AS "State",
    ROUND((SUM(i.subtotal - i.credit)):: NUMERIC, 2) AS "Sales",
    ROUND((SUM(i.tax)):: NUMERIC, 2) AS "Tax",
    ROUND(
        (
            SUM((i.subtotal - i.credit   i.tax) - ii.refunded_amount)
        ):: NUMERIC,
        2
    ) AS "Gross"
FROM
    invoices i
    JOIN customer ON i.customer_id = customer.id
    JOIN invoice_items ii ON i.id = ii.invoices_id
WHERE
    i.status = 'Paid'
    AND i.datepaid BETWEEN '2016-01-01' AND '2016-02-01'
    AND customer.billing_day <> 0
    AND customer.register_date < '2016-02-01'
    AND customer.account_exempt = 'f'
    AND customer.country = 'US'
    AND customer.state = 'AK'
GROUP BY
    customer.state;

 State |  Sales  |  Tax   |  Gross    
------- --------- -------- ----------
 AK    |     499 |   100  |      584 

Below is what the results should be:

 State |  Sales  |  Tax   |  Gross    
------- --------- -------- ----------
 AK    |     299 |    60  |      344 

My query is tabulating the same invoice subtotal, tax, and credit multiple times, which is artificially inflating the results. I need to change the query so that it incorporates each invoice into the calculation only once, but still looks at all the associated invoice items. I'm not sure how to accomplish this in sql. Thanks for any pointers!

CodePudding user response:

Preaggregate invoice_items, so it doesn't affect the other results:

SELECT c.state AS "State",
       ROUND((SUM(i.subtotal - i.credit)):: NUMERIC, 2) AS "Sales",
       ROUND((SUM(i.tax)):: NUMERIC, 2) AS "Tax",
       ROUND(SUM((i.subtotal - i.credit   i.tax) - ii.refunded_amount):: NUMERIC, 2
            ) AS "Gross"
FROM invoices i JOIN
     customer c
     ON i.customer_id = c.id JOIN
     (SELECT ii.invoices_id, SUM(ii.refunded_amount) as refunded_amount
      FROM invoice_items ii
      GROUP BY ii.invoiced_id
     ) ii
     ON i.id = ii.invoices_id
WHERE i.status = 'Paid' AND
      i.datepaid BETWEEN '2016-01-01' AND '2016-02-01' AND
      c.billing_day <> 0 AND
      c.register_date < '2016-02-01' AND
      c.account_exempt = 'f' AND
      c.country = 'US' AND
      c.state = 'AK'
GROUP BY c.state;

CodePudding user response:

I used Gordon Linoff's answer but I repeated the JOIN statement with the customers table and moved the WHERE clause into the subquery.

    SELECT
        customer.state AS "State",
        ROUND((SUM(i.subtotal - i.credit)):: NUMERIC, 2) AS "Sales",
        ROUND((SUM(i.tax)):: NUMERIC, 2) AS "Tax",
        ROUND(
            SUM((i.subtotal - i.credit   i.tax) - refunded_amount):: NUMERIC,
            2
        ) AS "Gross"
    FROM
        invoices i
        JOIN customer ON i.customer_id = customer.id
        JOIN (
            SELECT
                ii.invoices_id,
                SUM(ii.refunded_amount) AS refunded_amount
            FROM
                invoices i
                JOIN customer ON i.customer_id = customer.id
                JOIN invoice_items ii ON i.id = ii.invoices_id
            WHERE
                i.datepaid BETWEEN '2016-01-01' AND '2016-02-01' AND
                customer.billing_day <> 0 AND
                customer.register_date < '2016-02-01' AND
                customer.account_exempt = 'f' AND
                customer.country = 'US' AND
                customer.state = 'AK'
            GROUP BY
                ii.invoices_id
        ) ii ON i.id = ii.invoices_id
    GROUP BY customer.state;
  • Related