I am trying to summaries data from my Data Warehouse into a fact table and I want to be able to count up the number of transactions per type that there were per customer per month.
Columns being used are: Customer_id, Transaction_id, transaction_date, Transaction_type
Ideally what I want to get is.
Customer | Month | transaction_type_1 | transaction_type_2 | Total_transactions |
---|---|---|---|---|
12345 | 1 | 18 | 8 | 26 |
12345 | 2 | 23 | 14 | 37 |
67891 | 1 | 14 | 22 | 36 |
I have to put it into a subquery but I get the total number of type 1 transactions for all customers in each month. I have tried unsuccessfully to using partition on top of that but now very far outside my level.
Select
customer_id,
month,
count(transactions_id),
(select count(transactions_id) from DWH where transaction_type = 1),
(select count(transactions_id) from DWH where transaction_type = 2)
FROM DWH
GROUP BY customer_id, month
Incorrect table output looks something like this.
Customer | Month | transaction_type_1 | transaction_type_2 | Total_transactions |
---|---|---|---|---|
12345 | 1 | 432 | 564 | 26 |
12345 | 2 | 456 | 765 | 37 |
In a standalone table I can get the information, but I can not incorporate it into the fact table view.
Standalone this works to get individual counts of each type, but I haven't been able to rework that into a select subquery:
select customer_id, month, count(*)
FROM DWH
WHERE dwh.transaction_type = 1
Group BY dwh.customer_id, month;
Any help would be much appreciated.
CodePudding user response:
You may be getting incorrect results because the filters in the where clause of your respective subqueries do not consider the group by columns i.e.
The table alias d
helps us to distinguish between columns used in the outer/general query and the subqueries.
select
d.customer_id,
d.month,
count(d.transactions_id),
(
select count(transactions_id)
from DWH
where transaction_type = 1 and
customer_id = d.customer_id and
month = d.month
) as transaction_type_1,
(
select count(transactions_id)
from DWH
where transaction_type = 2 and
customer_id = d.customer_id and
month = d.month
) as transaction_type_2
FROM DWH d
GROUP BY d.customer_id, d.month
However, while this approach may work, it would be best if you tested this performance wise on your respective database and evaluate the cost metrics/query plan.
Another approach that may be performant uses case expressions to achieve the result and has been included below:
SELECT
customer_id as Customer,
month,
COUNT(
CASE WHEN transaction_type=1 THEN transactions_id END
) as transaction_type_1,
COUNT(
CASE WHEN transaction_type=2 THEN transactions_id END
) as transaction_type_2,
COUNT(1) as Total_transactions
FROM
DWH
GROUP BY
customer_id, month