I hope you are well. I am trying to count the number of observations I have in an employee database. The tables look more or less like this:
Date_Table
date_dt |
---|
2020-09-07 |
2020-09-14 |
2020-09-21 |
Employee_table
login_id | effective_date | is_active |
---|---|---|
a | 2020-09-07 | 1 |
a | 2020-09-14 | 1 |
b | 2020-09-07 | 1 |
b | 2020-09-14 | 0 |
c | 2020-09-21 | 1 |
Keep in mind that effective_date marks some change (attrition, position change, whatever; those are easily filtered). The higher the date, the more recent the change, and the latest row for a login_id gives its current status.
In the example above, 2020-09-14 is the day login_id b stopped being active in the table.
I want to produce something like this:
the_date | amount_of_employees |
---|---|
2020-09-07 | 2 |
2020-09-14 | 1 |
2020-09-21 | 2 |
This query works perfectly fine, and provides me the correct number:
SELECT '2020-09-07', COUNT(DISTINCT login_id) AS amount_of_employees
FROM (SELECT date_dt FROM Date_Table) AS dd,
     (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS chk
      FROM Employee_table
      WHERE effective_date <= '2020-09-07') AS dp
WHERE dp.is_active = 1
  AND dp.chk = 1
GROUP BY 1
ORDER BY 1 ASC;
Great! This one works and gives me the right value:
the_date | amount_of_employees |
---|---|
2020-09-07 | 2 |
However, when I try to build the full dataset with this query:
SELECT dd.date_dt, COUNT(DISTINCT login_id) AS amount_of_employees
FROM (SELECT date_dt FROM Date_Table) AS dd,
     (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS chk
      FROM Employee_table
      WHERE effective_date <= dd.date_dt) AS dp
WHERE dp.is_active = 1
  AND dp.chk = 1
GROUP BY 1
ORDER BY 1 ASC;
I get this error message:
Invalid operation: subquery in FROM may not refer to other relations of same query level
I tried to investigate this, but what I found either didn't work or doesn't necessarily apply. Maybe I am not getting it.
Any ideas? I would rather not write a lot of UNIONs, although that is a workaround.
Thanks in advance
CodePudding user response:
I'm not familiar with Amazon Redshift, but as long as the syntax is supported, you can use a scalar subquery to get the count. Inside it you can refer to the columns of the outer query, like this:
SELECT
    dt.date_dt,
    (
        SELECT COUNT(DISTINCT t.login_id)
        FROM (
            SELECT
                et.*,
                -- rn = 1 marks the most recent row per login_id
                ROW_NUMBER() OVER (PARTITION BY et.login_id ORDER BY et.effective_date DESC) AS rn
            FROM employee_table et
            -- correlated reference to the current row of the outer query
            WHERE et.effective_date <= dt.date_dt
        ) t
        WHERE t.rn = 1 AND t.is_active = 1
    ) AS amount
FROM date_table dt;
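If the engine rejects a correlated reference that reaches through a derived table (whether Redshift accepts the query above is not confirmed here), the same count can be sketched without any correlation: join each date to every row whose effective_date is on or before it, rank those rows per date and login_id, and keep only the latest one. The sketch below is a demonstration under assumptions, not a tested Redshift query. It runs the SQL against an in-memory SQLite database through Python's sqlite3 module, loaded with the sample rows from the question. The names `latest`, `rn`, and `counts_per_date` are made up for illustration.

```python
import sqlite3

# In-memory SQLite database with the sample rows from the question.
# SQLite stands in for Redshift here (an assumption); window functions
# need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Date_Table (date_dt TEXT);
INSERT INTO Date_Table VALUES ('2020-09-07'), ('2020-09-14'), ('2020-09-21');
CREATE TABLE Employee_table (login_id TEXT, effective_date TEXT, is_active INTEGER);
INSERT INTO Employee_table VALUES
    ('a', '2020-09-07', 1), ('a', '2020-09-14', 1),
    ('b', '2020-09-07', 1), ('b', '2020-09-14', 0),
    ('c', '2020-09-21', 1);
""")

QUERY = """
SELECT latest.date_dt, COUNT(DISTINCT latest.login_id) AS amount_of_employees
FROM (
    -- Pair every date with all rows effective on or before it, then
    -- number the rows per (date, login_id) from newest to oldest.
    SELECT dt.date_dt, et.login_id, et.is_active,
           ROW_NUMBER() OVER (
               PARTITION BY dt.date_dt, et.login_id
               ORDER BY et.effective_date DESC
           ) AS rn
    FROM Date_Table dt
    JOIN Employee_table et ON et.effective_date <= dt.date_dt
) AS latest
-- Keep only the latest status per login_id, and only if it is active.
WHERE latest.rn = 1 AND latest.is_active = 1
GROUP BY latest.date_dt
ORDER BY latest.date_dt
"""

counts_per_date = conn.execute(QUERY).fetchall()
for row in counts_per_date:
    print(row)
```

On the sample data this yields 2, 1, and 2 employees for the three dates, matching the table the question asks for. Note that a date on which no employee is active would be missing from the result rather than shown with a count of 0.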
CodePudding user response:
Here is a solution for this:
SELECT dt.date_dt, COUNT(DISTINCT login_id) AS other_account
FROM Date_Table dt
LEFT JOIN employee_table et
    ON dt.date_dt BETWEEN et.effective_date AND et.effective_date /* + some additional interval */
WHERE et.is_active = 1 /* and other WHERE clauses */
GROUP BY 1
Thanks for all your support.
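One possible reading of the "additional interval" in the join above, offered as an assumption rather than what the author necessarily meant: treat each row as valid from its effective_date until the next change for the same login_id, and count a login on a date when that date falls inside an active span. The sketch below derives the end of each span with LEAD and runs against an in-memory SQLite database via Python's sqlite3 module as a stand-in for Redshift. The names `spans`, `valid_from`, `next_change`, and `interval_counts` are hypothetical.

```python
import sqlite3

# Sample rows from the question, loaded into an in-memory SQLite database.
# SQLite replaces Redshift here (an assumption); LEAD needs SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Date_Table (date_dt TEXT);
INSERT INTO Date_Table VALUES ('2020-09-07'), ('2020-09-14'), ('2020-09-21');
CREATE TABLE Employee_table (login_id TEXT, effective_date TEXT, is_active INTEGER);
INSERT INTO Employee_table VALUES
    ('a', '2020-09-07', 1), ('a', '2020-09-14', 1),
    ('b', '2020-09-07', 1), ('b', '2020-09-14', 0),
    ('c', '2020-09-21', 1);
""")

QUERY = """
WITH spans AS (
    -- Each row is assumed valid from its own effective_date up to, but not
    -- including, the next effective_date of the same login_id.
    SELECT login_id, is_active,
           effective_date AS valid_from,
           LEAD(effective_date) OVER (
               PARTITION BY login_id ORDER BY effective_date
           ) AS next_change
    FROM Employee_table
)
SELECT dt.date_dt, COUNT(DISTINCT s.login_id) AS amount_of_employees
FROM Date_Table dt
LEFT JOIN spans s
    ON dt.date_dt >= s.valid_from
   AND (s.next_change IS NULL OR dt.date_dt < s.next_change)
   AND s.is_active = 1  -- filter in ON so dates without matches still appear
GROUP BY dt.date_dt
ORDER BY dt.date_dt
"""

interval_counts = conn.execute(QUERY).fetchall()
for row in interval_counts:
    print(row)
```

Keeping the is_active filter in the ON clause instead of WHERE preserves the LEFT JOIN, so a date with no active employees would come back with a count of 0.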