I've been struggling with this for quite a while now and still cannot find a suitable solution. My situation is that I have a table on SQL Server 2019 that looks like this called JOB:
EMPLID | EMP_RCD | EFFDT | EFF_SEQ | HR_STATUS | Company |
---|---|---|---|---|---|
102030 | 0 | 2013-02-04 | 0 | Active | 100 |
102030 | 0 | 2013-02-04 | 1 | Active | 100 |
102030 | 0 | 2013-05-28 | 0 | Active | 100 |
102030 | 0 | 2013-05-28 | 1 | Inactive | 100 |
102030 | 1 | 2013-02-04 | 0 | Active | 516 |
102030 | 1 | 2013-05-28 | 0 | Inactive | 516 |
102030 | 2 | 2013-05-28 | 0 | Active | 100 |
... | ... | ... | ... | ... | ... |
I am trying to find the MAX EFFDT that has the MAX EFF_SEQ value for each EMPLID and EMP_RCD combination. For example EMPLID 102030 and EMP_RCD 0 should give me the MAX EFFDT of 2013-05-28 and the EFF_SEQ 1 and HR_STATUS Inactive. My end result should look like this:
EMPLID | EMP_RCD | EFFDT | EFF_SEQ | HR_STATUS | Company |
---|---|---|---|---|---|
102030 | 0 | 2013-05-28 | 1 | Inactive | 100 |
102030 | 1 | 2013-05-28 | 0 | Inactive | 516 |
102030 | 2 | 2020-01-01 | 0 | Active | 516 |
102030 | 3 | 2013-09-16 | 0 | Inactive | 516 |
102030 | 4 | 2016-04-01 | 0 | Inactive | 516 |
102035 | 0 | 2020-08-18 | 0 | Active | 100 |
102037 | 0 | 2020-01-01 | 0 | Active | 100 |
102038 | 0 | 2020-01-01 | 0 | Inactive | 500 |
102038 | 1 | 2020-01-01 | 0 | Inactive | 500 |
Referencing the accepted answer here I have tried running my query like this:
WITH JOBS AS (
SELECT A.EMPLID, A.EMP_RCD, A.EFFDT, A.EFF_SEQ, HR_STATUS
FROM JOB A
INNER JOIN
(SELECT EMPLID, EMP_RCD, MAX(EFFDT) AS MAX_EFFDT, MAX(EFF_SEQ) AS MAX_EFF_SEQ
FROM JOB
GROUP BY EMPLID, EMP_RCD) B
ON A.EMPLID = B.EMPLID
AND A.EMP_RCD = B.EMP_RCD
AND (A.EFFDT = B.MAX_EFFDT
OR A.EFF_SEQ = B.MAX_EFF_SEQ)
)
SELECT A.*, B.MAXEFFDT FROM JOB A
INNER JOIN (SELECT EMPLID, EMP_RCD, MAX(EFFDT) AS MAXEFFDT
FROM JOBS
GROUP BY EMPLID, EMP_RCD) AS B ON A.EMPLID=B.EMPLID AND A.EMP_RCD=B.EMP_RCD AND A.EFFDT=B.MAXEFFDT
ORDER BY A.EMPLID ASC, A.EMP_RCD ASC
But the output looks like this
EMPLID | EMP_RCD | EFFDT | EFF_SEQ | HR_STATUS | Company |
---|---|---|---|---|---|
102030 | 0 | 2013-05-28 | 0 | Active | 100 |
102030 | 0 | 2013-05-28 | 1 | Inactive | 100 |
102030 | 1 | 2013-05-28 | 0 | Inactive | 516 |
102030 | 2 | 2020-01-01 | 0 | Active | 516 |
102030 | 3 | 2013-09-16 | 0 | Inactive | 516 |
102030 | 4 | 2016-04-01 | 0 | Inactive | 516 |
102035 | 0 | 2020-08-18 | 0 | Active | 100 |
102037 | 0 | 2020-01-01 | 0 | Active | 100 |
102038 | 0 | 2020-01-01 | 0 | Inactive | 500 |
102038 | 1 | 2020-01-01 | 0 | Inactive | 500 |
The first line in the result table shouldn't exist and everything else is correct. I am not sure what I'm missing as to why it would appear there. I've tried multiple other solutions and it always either resulted in the first line being printed or it wasn't even close to the expected output. Any and all assistance would be appreciated. I can share the full table if needed, it would've been too large to share in full here (only 65 rows though). Thanks!
CodePudding user response:
select EMPLID
,EMP_RCD
,EFFDT
,EFF_SEQ
,HR_STATUS
,Company
from (
select *
,rank() over(partition by EMPLID, EMP_RCD order by EFFDT desc, EFF_SEQ desc) as rnk
from jobs
) jobs
where rnk = 1
EMPLID | EMP_RCD | EFFDT | EFF_SEQ | HR_STATUS | Company |
---|---|---|---|---|---|
102030 | 0 | 2013-05-28 | 1 | Inactive | 100 |
102030 | 1 | 2013-05-28 | 0 | Inactive | 516 |
102030 | 2 | 2013-05-28 | 0 | Active | 100 |