I would like to display a running total of Invoice_Amount. Here is my current query:
SELECT cust_name, COUNT(*) as Invoice_Amount, Invoice.invoice_date
FROM Customer, Invoice
WHERE Customer.customer_id = Invoice.customer_id
GROUP BY Invoice.customer_id, Customer.cust_name,invoice_date;
and here is the current output:
cust_name Invoice_Amount invoice_date
Smith Bros Ltd 1 1996-04-20 00:00:00.000
Smith & Smith Solicitors 1 1996-05-22 00:00:00.000
BigBus Ltd 1 1996-05-30 00:00:00.000
ANIMAID 1 1996-06-12 00:00:00.000
Chopsticks 1 1996-06-20 00:00:00.000
Chopsticks 1 1998-04-20 00:00:00.000
Where I would like the Invoice_Amount in both cases to read 2 like so:
cust_name Invoice_Amount invoice_date
Smith Bros Ltd 1 1996-04-20 00:00:00.000
Smith & Smith Solicitors 1 1996-05-22 00:00:00.000
BigBus Ltd 1 1996-05-30 00:00:00.000
ANIMAID 1 1996-06-12 00:00:00.000
Chopsticks 2 1996-06-20 00:00:00.000
Chopsticks 2 1998-04-20 00:00:00.000
This is so I can eventually do something along the lines of:
HAVING (COUNT(*) > 1)
How would I go about getting to this result
CodePudding user response:
There is no need for a GROUP BY
or a HAVING
because you're not actually grouping by anything in the final result.
;;;/* CTE with leading semi-colons for safety */;;;WITH src AS
(
SELECT c.cust_name, i.invoice_date,
COUNT(i.invoice_date) OVER (PARTITION BY i.customer_id)
AS Invoice_Count
FROM dbo.Customer AS c
INNER JOIN dbo.Invoice AS i
ON c.customer_id = i.customer_id
)
SELECT cust_name, Invoice_Count, invoice_date
FROM src
-- WHERE Invoice_Count > 1;
CodePudding user response:
Well, as from your data, the combinations of invoice_date
and cust_name
seem to be unique - as COUNT(*) always returns 1.
You now seem to need the count value that you call invoice_amount
to tally up for the same cust_name
. 'Chopsticks' is occurring twice in your report, and, for 'Chopsticks', you need the value 2. But, still, you want to keep both rows.
Functions that sort-of aggregate data, but still return the same number of rows as the input, are not GROUP BY or aggregate functions, they are window functions, or OLAP/Analytic functions.
So, start from your grouping query, but then select from it, applying an OLAP function , and select from that outer query in turn, filtering for the OLAP function's result:
WITH
grp AS (
SELECT
cust_name
, count(*) AS invoice_amount
, invoice.invoice_date
FROM customer
JOIN invoice ON customer.customer_id = invoice.customer_id
GROUP BY
invoice.customer_id
, customer.cust_name
, invoice_date;
)
,
olap AS (
SELECT
cust_name
, SUM(invoice_amount) OVER(PARTITION BY cust_name) AS invoice_amount
, invoice_date
FROM grp
)
SELECT
*
FROM olap
WHERE invoice_amount > 1;