Home > front end >  Finding MAX values for each combination of two columns
Finding MAX values for each combination of two columns

Time:10-18

I've been struggling with this for quite a while now and still cannot find a suitable solution. My situation is that I have a table on SQL Server 2019 that looks like this called JOB:

EMPLID EMP_RCD EFFDT EFF_SEQ HR_STATUS Company
102030 0 2013-02-04 0 Active 100
102030 0 2013-02-04 1 Active 100
102030 0 2013-05-28 0 Active 100
102030 0 2013-05-28 1 Inactive 100
102030 1 2013-02-04 0 Active 516
102030 1 2013-05-28 0 Inactive 516
102030 2 2013-05-28 0 Active 100
... ... ... ... ... ...

I am trying to find the MAX EFFDT that has the MAX EFF_SEQ value for each EMPLID and EMP_RCD combination. For example EMPLID 102030 and EMP_RCD 0 should give me the MAX EFFDT of 2013-05-28 and the EFF_SEQ 1 and HR_STATUS Inactive. My end result should look like this:

EMPLID EMP_RCD EFFDT EFF_SEQ HR_STATUS Company
102030 0 2013-05-28 1 Inactive 100
102030 1 2013-05-28 0 Inactive 516
102030 2 2020-01-01 0 Active 516
102030 3 2013-09-16 0 Inactive 516
102030 4 2016-04-01 0 Inactive 516
102035 0 2020-08-18 0 Active 100
102037 0 2020-01-01 0 Active 100
102038 0 2020-01-01 0 Inactive 500
102038 1 2020-01-01 0 Inactive 500

Referencing the accepted answer here I have tried running my query like this:

WITH JOBS AS (
SELECT A.EMPLID, A.EMP_RCD, A.EFFDT, A.EFF_SEQ, HR_STATUS
FROM JOB A
INNER JOIN
    (SELECT EMPLID, EMP_RCD, MAX(EFFDT) AS MAX_EFFDT, MAX(EFF_SEQ) AS MAX_EFF_SEQ 
        FROM JOB 
        GROUP BY EMPLID, EMP_RCD) B
    ON A.EMPLID = B.EMPLID
    AND A.EMP_RCD = B.EMP_RCD
    AND (A.EFFDT = B.MAX_EFFDT
    OR A.EFF_SEQ = B.MAX_EFF_SEQ)
)

SELECT A.*, B.MAXEFFDT FROM JOB A
INNER JOIN (SELECT EMPLID, EMP_RCD, MAX(EFFDT) AS MAXEFFDT
FROM JOBS
GROUP BY EMPLID, EMP_RCD) AS B ON A.EMPLID=B.EMPLID AND A.EMP_RCD=B.EMP_RCD AND A.EFFDT=B.MAXEFFDT
ORDER BY A.EMPLID ASC, A.EMP_RCD ASC

But the output looks like this

EMPLID EMP_RCD EFFDT EFF_SEQ HR_STATUS Company
102030 0 2013-05-28 0 Active 100
102030 0 2013-05-28 1 Inactive 100
102030 1 2013-05-28 0 Inactive 516
102030 2 2020-01-01 0 Active 516
102030 3 2013-09-16 0 Inactive 516
102030 4 2016-04-01 0 Inactive 516
102035 0 2020-08-18 0 Active 100
102037 0 2020-01-01 0 Active 100
102038 0 2020-01-01 0 Inactive 500
102038 1 2020-01-01 0 Inactive 500

The first line in the result table shouldn't exist and everything else is correct. I am not sure what I'm missing as to why it would appear there. I've tried multiple other solutions and it always either resulted in the first line being printed or it wasn't even close to the expected output. Any and all assistance would be appreciated. I can share the full table if needed, it would've been too large to share in full here (only 65 rows though). Thanks!

CodePudding user response:

select  EMPLID  
       ,EMP_RCD 
       ,EFFDT   
       ,EFF_SEQ 
       ,HR_STATUS   
       ,Company
from   (
        select  *
                ,rank() over(partition by EMPLID, EMP_RCD order by EFFDT desc, EFF_SEQ desc) as rnk
        from    jobs
       ) jobs
where  rnk = 1
EMPLID EMP_RCD EFFDT EFF_SEQ HR_STATUS Company
102030 0 2013-05-28 1 Inactive 100
102030 1 2013-05-28 0 Inactive 516
102030 2 2013-05-28 0 Active 100

Fiddle

  • Related