Home > Software design >  SQL Snowflake Min Date For Each Record & Group By Month
SQL Snowflake Min Date For Each Record & Group By Month

Time:07-11

I am using Snowflake to get some information from the data. I have a table in which the same employee (PRS_ID) can be included in many rows and I want to take the EARLIEST date (BGN_DATE) for each employee (PRS_ID). Next, I want to see how many employees I have each month.

Ideally if I could have in YYYY-MM format (YYYYMM will work too). What I was working with is below:

SELECT PRS_ID, MIN(BGN_DATE), MONTH(BGN_DATE)
FROM myTable
WHERE EMP_STS = 'T' 
GROUP BY PRS_ID, MONTH(BGN_DATE)

However, this will show me the data per employee (PRS_ID), which is too granular (as explained above). But when I remove "PRS_ID" from grouping I get below error:

SQL compilation error: error line 1 at position 7 'myTable.PRS_ID' in select clause is neither an aggregate nor in the group by clause.

Does anyone know how to fix it?

Thank you

Sample Data:

PRS_ID EMP_STS BGN_DATE
homsimps T 2022-01-30
homsimps T 2022-02-28
homsimps T 2022-03-30
bartsimps T 2022-01-30
bartsimps T 2022-02-28
bartsimps T 2022-03-31
lisasimps T 2022-04-30
lisasimps T 2022-05-31
lisasimps T 2022-06-30
lisasimps T 2022-07-30
margesimps T 2022-02-28
margesimps T 2022-03-30

Expected Outcome:

Period Count
2022-01 2
2022-02 1
2022-03 0
2022-04 1

CodePudding user response:

Using aggregation twice:

WITH cte AS (
   SELECT PRR_ID, MIN(BGN_DATE) AS min_bgn_date
   FROM my_table
   WHERE EMP_STS = 'T'
   GROUP BY PRS_ID
)
SELECT TO_VARCHAR(min_bgn_date, 'YYYYMM') AS month, COUNT(*) AS cnt
FROM cte
GROUP BY TO_VARCHAR(min_bgn_date, 'YYYYMM');
-- GROUP BY month

CodePudding user response:

There is a simpler function here, DATE_TRUNC will allow you to convert dates to months. You can then convert to the format you'd like.

WITH MY_CTE AS (
    SELECT
        PRS_ID,
        DATE_TRUNC(MONTH, MIN(BGN_DATE)) AS MONTH_START
    FROM
        myTable
    WHERE
        EMP_STS = 'T'
    GROUP BY 1
)
SELECT
    TO_VAR_CHAR(MONTH_START, 'yyyy-mm') AS PERIOD,
    COUNT(PRS_ID)
GROUP BY 1
  • Related