Home > other >  SQL filtering counts in a fact table using a subquery or better alternative
SQL filtering counts in a fact table using a subquery or better alternative

Time:11-09

I am trying to summaries data from my Data Warehouse into a fact table and I want to be able to count up the number of transactions per type that there were per customer per month.

Columns being used are: Customer_id, Transaction_id, transaction_date, Transaction_type

Ideally what I want to get is.

Customer Month transaction_type_1 transaction_type_2 Total_transactions
12345 1 18 8 26
12345 2 23 14 37
67891 1 14 22 36

I have to put it into a subquery but I get the total number of type 1 transactions for all customers in each month. I have tried unsuccessfully to using partition on top of that but now very far outside my level.

Select 
 customer_id, 
 month, 
 count(transactions_id),
 (select count(transactions_id) from DWH where transaction_type = 1),
 (select count(transactions_id) from DWH where transaction_type = 2)
FROM DWH
GROUP BY customer_id, month

Incorrect table output looks something like this.

Customer Month transaction_type_1 transaction_type_2 Total_transactions
12345 1 432 564 26
12345 2 456 765 37

In a standalone table I can get the information, but I can not incorporate it into the fact table view.

Standalone this works to get individual counts of each type, but I haven't been able to rework that into a select subquery:

select customer_id, month, count(*) 
   FROM DWH 
   WHERE dwh.transaction_type = 1
   Group BY dwh.customer_id, month;

Any help would be much appreciated.

CodePudding user response:

You may be getting incorrect results because the filters in the where clause of your respective subqueries do not consider the group by columns i.e.

The table alias d helps us to distinguish between columns used in the outer/general query and the subqueries.

select 
 d.customer_id, 
 d.month, 
 count(d.transactions_id),
 (
     select count(transactions_id) 
     from DWH 
     where transaction_type = 1 and 
           customer_id = d.customer_id and
           month = d.month
 ) as transaction_type_1,
 (
     select count(transactions_id) 
     from DWH 
     where transaction_type = 2 and 
           customer_id = d.customer_id and
           month = d.month
 ) as transaction_type_2
FROM DWH d
GROUP BY d.customer_id, d.month

However, while this approach may work, it would be best if you tested this performance wise on your respective database and evaluate the cost metrics/query plan.

Another approach that may be performant uses case expressions to achieve the result and has been included below:

SELECT
     customer_id as Customer,
     month,
     COUNT(
         CASE WHEN transaction_type=1 THEN transactions_id END
     ) as transaction_type_1,
     COUNT(
         CASE WHEN transaction_type=2 THEN transactions_id END
     ) as transaction_type_2,
     COUNT(1) as Total_transactions
FROM
    DWH
GROUP BY
    customer_id, month
  • Related