I am using Snowflake to get some information from the data. I have a table in which the same employee (PRS_ID) can be included in many rows and I want to take the EARLIEST date (BGN_DATE) for each employee (PRS_ID). Next, I want to see how many employees I have each month.
Ideally if I could have in YYYY-MM format (YYYYMM will work too). What I was working with is below:
SELECT PRS_ID, MIN(BGN_DATE), MONTH(BGN_DATE)
FROM myTable
WHERE EMP_STS = 'T'
GROUP BY PRS_ID, MONTH(BGN_DATE)
However, this will show me the data per employee (PRS_ID), which is too granular (as explained above). But when I remove "PRS_ID" from grouping I get below error:
SQL compilation error: error line 1 at position 7 'myTable.PRS_ID' in select clause is neither an aggregate nor in the group by clause.
Does anyone know how to fix it?
Thank you
Sample Data:
PRS_ID | EMP_STS | BGN_DATE |
---|---|---|
homsimps | T | 2022-01-30 |
homsimps | T | 2022-02-28 |
homsimps | T | 2022-03-30 |
bartsimps | T | 2022-01-30 |
bartsimps | T | 2022-02-28 |
bartsimps | T | 2022-03-31 |
lisasimps | T | 2022-04-30 |
lisasimps | T | 2022-05-31 |
lisasimps | T | 2022-06-30 |
lisasimps | T | 2022-07-30 |
margesimps | T | 2022-02-28 |
margesimps | T | 2022-03-30 |
Expected Outcome:
Period | Count |
---|---|
2022-01 | 2 |
2022-02 | 1 |
2022-03 | 0 |
2022-04 | 1 |
CodePudding user response:
Using aggregation twice:
WITH cte AS (
SELECT PRR_ID, MIN(BGN_DATE) AS min_bgn_date
FROM my_table
WHERE EMP_STS = 'T'
GROUP BY PRS_ID
)
SELECT TO_VARCHAR(min_bgn_date, 'YYYYMM') AS month, COUNT(*) AS cnt
FROM cte
GROUP BY TO_VARCHAR(min_bgn_date, 'YYYYMM');
-- GROUP BY month
CodePudding user response:
There is a simpler function here, DATE_TRUNC
will allow you to convert dates to months. You can then convert to the format you'd like.
WITH MY_CTE AS (
SELECT
PRS_ID,
DATE_TRUNC(MONTH, MIN(BGN_DATE)) AS MONTH_START
FROM
myTable
WHERE
EMP_STS = 'T'
GROUP BY 1
)
SELECT
TO_VAR_CHAR(MONTH_START, 'yyyy-mm') AS PERIOD,
COUNT(PRS_ID)
GROUP BY 1