How do I Improve T-SQL query performance for retrieving the most recent date?

Time:01-13

I have an employee table that contains the columns

employee_id, name, hire_date, termination_date, rehire_date, is_active

in SQL Server. I would like to retrieve the most recent date of hire, termination or rehire for each employee, but only if the employee is active.

The result should include the employee_id, name, and the most recent date. How can I achieve this with a single query?

I can do it with the query below:

SELECT 
    employee_id, name, MAX(date) as most_recent_date
FROM 
    (SELECT 
         employee_id, name, hire_date AS date 
     FROM 
         employee
     UNION
     SELECT 
         employee_id, name, termination_date 
     FROM 
         employee
     UNION
     SELECT 
         employee_id, name, rehire_date 
     FROM 
         employee) AS t
WHERE 
    employee_id IN (SELECT employee_id 
                    FROM employee 
                    WHERE is_active = 1)
GROUP BY 
    employee_id, name

This solution seems to work, but I am not sure if it's the most efficient way. I am also worried about the performance when the employee table is large.

Can anyone advise on a better and more efficient way to do this?

CodePudding user response:

You can try this, which takes the row-wise maximum of the three columns in a single pass over the table:

SELECT employee_id, name,
       (SELECT MAX(v)
        FROM (VALUES (hire_date), (termination_date), (rehire_date)) AS value(v)) AS most_recent_date
FROM employee
WHERE is_active = 1

CodePudding user response:

Logically, only one of those dates would be non-NULL in any given row, so a simple GROUP BY is enough:

select employee_id, name, 
    max(coalesce(hire_date, termination_date, rehire_date)) as most_recent_date
from Employee
where is_Active = 1
group by employee_id, name;
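
Note that COALESCE returns the first non-NULL argument, not the largest one, so the query above only gives the most recent date when at most one of the three columns is filled in. A minimal sketch of the difference, using a made-up sample row (the dates and the aliases t, x and v are illustrative, not from the question):

```sql
-- Hypothetical row: hired 2015, terminated 2018, rehired 2020.
SELECT COALESCE(t.hire_date, t.termination_date, t.rehire_date) AS first_non_null,
       (SELECT MAX(v)
        FROM (VALUES (t.hire_date), (t.termination_date), (t.rehire_date)) AS x(v)) AS latest
FROM (VALUES (CAST('2015-03-01' AS date),
              CAST('2018-06-30' AS date),
              CAST('2020-01-15' AS date))) AS t(hire_date, termination_date, rehire_date);
-- first_non_null is 2015-03-01, latest is 2020-01-15.
```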

But since the design already seems flawed (why not a single date column, with another column for the event type?), we can't be sure that assumption holds. If it doesn't, you could use:

select employee_id, name, max(case
  when
  coalesce(hire_date,'00010101') > coalesce(termination_date,'00010101') and
  coalesce(hire_date,'00010101') > coalesce(rehire_date,'00010101') then hire_date
  when
  coalesce(termination_date,'00010101') > coalesce(rehire_date,'00010101') then termination_date
  else
  coalesce(rehire_date,'00010101')
  end) as most_recent_date
from Employee
where is_Active = 1
-- name must appear in the group by because it is selected without an aggregate
group by employee_id, name;

or one of the variations in other replies.

CodePudding user response:

This is a little cheesy but I think it should work.

-- ISNULL keeps a missing date (e.g. no termination yet) from turning every comparison
-- into UNKNOWN; >= keeps ties from falling through to NULL.
-- Assumes all real dates are later than 1900-01-01.
SELECT employee_id, name,
    CASE
        WHEN MAX(hire_date) >= ISNULL(MAX(termination_date), '19000101') AND
             MAX(hire_date) >= ISNULL(MAX(rehire_date), '19000101') THEN MAX(hire_date)
        WHEN MAX(termination_date) >= ISNULL(MAX(rehire_date), '19000101') THEN MAX(termination_date)
        ELSE MAX(rehire_date)
    END AS most_recent_date
FROM employee
WHERE is_active = 1
GROUP BY employee_id, name

CodePudding user response:

In Azure SQL Database and SQL Server 2022 we can use GREATEST. It will be something like this:

SELECT employee_id, name, GREATEST(hire_date, termination_date, rehire_date) AS most_recent_date
FROM employee
WHERE is_Active = 1
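
In SQL Server, GREATEST skips NULL arguments as long as at least one argument is non-NULL, so an employee with no termination date still gets a result. A small sketch with made-up literal dates (not taken from the question's data):

```sql
-- The NULL in the middle is ignored; only the two real dates are compared.
SELECT GREATEST(CAST('2015-03-01' AS date),
                CAST(NULL AS date),
                CAST('2020-01-15' AS date)) AS latest;
-- latest is 2020-01-15
```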

If you are not able to use this function, then the old way:

SELECT employee_id, name,
       CASE
            -- >= rather than > so that two equal latest dates still match a branch
            WHEN hire_date >= ISNULL(termination_date, '1900-01-01') AND hire_date >= ISNULL(rehire_date, '1900-01-01') THEN hire_date
            WHEN termination_date >= ISNULL(hire_date, '1900-01-01') AND termination_date >= ISNULL(rehire_date, '1900-01-01') THEN termination_date
            WHEN rehire_date >= ISNULL(hire_date, '1900-01-01') AND rehire_date >= ISNULL(termination_date, '1900-01-01') THEN rehire_date
       END AS most_recent_date
FROM employee
WHERE is_Active = 1

There may be a tidier way to handle the NULLs.

In both cases it should be better in terms of IO, because the table is read once instead of once per UNION branch plus the IN subquery.
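
One way to check the IO difference on your own data is SET STATISTICS IO, which reports the reads per statement. This is only a sketch: it assumes the table is named employee as in the question, the aliases d and dates(v) are made up for the example, and the actual numbers depend on your indexes and the plan the optimizer picks.

```sql
-- Report logical/physical reads for each statement in the Messages output.
SET STATISTICS IO ON;

-- Original shape: employee is referenced in each UNION branch and in the IN subquery.
SELECT employee_id, name, MAX(d) AS most_recent_date
FROM (SELECT employee_id, name, hire_date AS d FROM employee
      UNION
      SELECT employee_id, name, termination_date FROM employee
      UNION
      SELECT employee_id, name, rehire_date FROM employee) AS t
WHERE employee_id IN (SELECT employee_id FROM employee WHERE is_active = 1)
GROUP BY employee_id, name;

-- Single-reference shape: row-wise MAX over the three columns.
SELECT employee_id, name,
       (SELECT MAX(v)
        FROM (VALUES (hire_date), (termination_date), (rehire_date)) AS dates(v)) AS most_recent_date
FROM employee
WHERE is_active = 1;

SET STATISTICS IO OFF;
```

Compare the "logical reads" figures reported for employee by the two statements.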
