Home > Software design >  Query 2 COUNTS with a GROUP BY statement trouble
Query 2 COUNTS with a GROUP BY statement trouble

Time:05-27

I am trying to query a count from registered users and enrolled users for each group in the table.

However, I keep getting duplicates of the Group names, registered, and enrolled. I need it to be just the unique group name with the count in its respective column. I am a novice, and I've been wrapping my head around this for 3 days. Any assistance would be greatly appreciated. below is what I am closest with, but still getting duplicate group names.

WITH CTE_GROUP as (
    SELECT DISTINCT [GROUP] AS GROUPS
    FROM TABLE1
    WHERE AUTH_PAGE = 'X' AND CAST(CreationDate AS DATE) >= '2022-04-01' AND CAST(CreationDate AS DATE) < '2022-05-01' AND EMAIL IS NOT NULL AND EMAIL <> '' AND EMAIL NOT LIKE '%@company.com' OR CAST(CreationDate AS DATE) >= '2022-04-01' AND CAST(CreationDate AS DATE) < '2022-05-01' AND EMAIL IS NOT NULL AND EMAIL <> '' AND EMAIL NOT LIKE '%@company.com'
    GROUP BY [GROUP]
),
CTE_REGISTERED as (
    SELECT [GROUP], COUNT (*) AS REGISTERED
    FROM TABLE1
    WHERE CAST(CreationDate AS DATE) >= '2022-04-01' AND CAST(CreationDate AS DATE) < '2022-05-01' AND EMAIL IS NOT NULL AND EMAIL <> '' AND EMAIL NOT LIKE '%@company.com'
    GROUP BY [GROUP]
),
CTE_ENROLLED as (
    SELECT [GROUP], COUNT (*) AS ENROLLED
    FROM TABLE1
    WHERE AUTH_PAGE = 'X' AND CAST(CreationDate AS DATE) >= '2022-04-01' AND CAST(CreationDate AS DATE) < '2022-05-01' AND EMAIL IS NOT NULL AND EMAIL <> '' AND EMAIL NOT LIKE '%@company.com'
    GROUP BY [GROUP]
)
SELECT DISTINCT GROUPS, REGISTERED, ENROLLED
FROM CTE_GROUP, CTE_REGISTERED, CTE_ENROLLED

My results come out to this.

GROUPS               REGISTERED       ENROLLED
CompanyA              3                 2
CompanyB              3                 3
CompanyA              3                 2
CompanyB              3                 3

The result I am looking for should be

GROUPS               REGISTERED       ENROLLED
CompanyA              3                 2
CompanyB              3                 3

Thanks

CodePudding user response:

I believe you need a "conditional aggregate" in a single query:

SELECT
      [group]
    , count(*) AS registered
    , count(CASE WHEN AUTH_PAGE = 'X' THEN 1 END) AS enrolled
FROM table1
WHERE CreationDate >= '20220401'
    AND CreationDate < '20220501'
    AND EMAIL NOT LIKE '%@company.com'
GROUP BY
     [group]
;

In essence you use a case expression inside an aggregation function, such as count() and this way your count inherits the "condition(s)" defined through that case expression.

Also note you do NOT need to cast those [CreationDate] to "date" for the where clause predicates to work. All date/time related data types can be compared to date literals. Furthermore, in SQL Server, the "safest" date literal is just in YYYYMMDD format.

CodePudding user response:

Your resultset is basically a CROSS JOIN, because you do not have JOIN conditions between the three CTEs.

Try this:

SELECT 
    * 
FROM 
    CTE_GROUP AS [Groups] 
    INNER JOIN CTE_Registered AS [Registered] ON [Groups].[GROUPS] = [Registered].[GROUP]
    INNER JOIN CTE_ENROLLED AS [Enrolled] ON [Groups].[GROUPS] = [Enrolled].[GROUP]

In general, when you are dealing with data from multiple tables (whether they are actual physical tables, views, or in-memory tables), you will need to let the database engine know how this data is related. You do so by creating relationships (using primary and foreign keys), and by instructing the engine in the query using the JOIN conditions. Tables can be related through more than one field, in which case you can string them together in the JOIN condition just like a WHERE clause.

Be aware that there are multiple ways to perform a JOIN, and the outcome will depend on whether or not you use the correct type. In the example query I provided, I'm using an INNER JOIN, which ensures you only get results returned that exist in both tables from included in the JOIN.

  • Related