How to avoid duplication while trying to display employee work details consisting of start & end dat-CodePudding

Below is the table I have created and inserted values in it:

CREATE TABLE employees_list   
(employeeID int identity(1,1),   
employeeName varchar(25))  
GO   
  
INSERT INTO employees_list VALUES ('Kevin'),('Charles')  
GO   
   
CREATE TABLE hourlyRates   
(employeeID int,   
rate int,   
rateDate date)   
  
INSERT INTO hourlyRates VALUES (1, 28, '2016-01-01'),   
                               (1, 39, '2016-02-01'),  
                               (2, 43, '2016-01-01'),  
                               (2, 57, '2016-02-01')  
  
  
CREATE TABLE workingHours   
(employeeID int,   
startdate datetime,   
enddate datetime)  
GO   
  
INSERT INTO workingHours VALUES (1, '2016-01-01 09:00', '2016-01-01 17:00'),  
                                (1, '2016-01-02 09:00', '2016-01-02 17:00'),  
                                (1, '2016-02-01 10:00', '2016-02-01 16:00'),  
                                (1, '2016-02-02 11:00', '2016-02-02 13:00'),  
                                (2, '2016-01-01 10:00', '2016-01-01 16:00'),  
                                (2, '2016-01-02 08:00', '2016-01-02 14:00'),  
                                (2, '2016-02-01 14:00', '2016-02-01 19:00'),  
                                (2, '2016-02-02 13:00', '2016-02-02 16:00')  
GO

SELECT * FROM employees_list
SELECT * FROM hourlyRates
SELECT * FROM workingHours

Now the question is:

Display employee ID, name, start date, end date, hours worked and hourly rate for the Employee whose ID number is 1.

This is what I have done:

SELECT workingHours.employeeID,employeeName,startdate,enddate,
DATEDIFF(HOUR,startdate,enddate) AS 'Hours Worked',
rate AS 'Hourly Rate'
FROM hourlyRates,workingHours,employees_list
WHERE hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND workingHours.employeeID = 1

And I got the following result:

The problem with the result above is that the result is being repeated or duplicated from row number 5 to row number 8. It is supposed to generate the first 4 rows if I'm not mistaken.

I even tried adding DISTINCT in the SELECT statement and still it is showing duplicated result.

What change is needed on my query to eliminate the duplication?

CodePudding user response：

The problem is you're joining the tables only on the employeeID. With that, you will get a row for every combination of hourlyRates and workingHours - which is 8 in this case. You'd have to join the tables somehow on the dates as well. You want to take a rate from hourlyRates only when it's in the correct month. One of the ways to do that would be:

SELECT workingHours.employeeID,employeeName,startdate,enddate,
DATEDIFF(HOUR,startdate,enddate) AS 'Hours Worked',
rate AS 'Hourly Rate'
FROM hourlyRates,workingHours,employees_list
WHERE hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND (hourlyRates.rateDate
    BETWEEN 
    DATEFROMPARTS(DATEPART(YEAR, workingHours.startDate), DATEPART(MONTH,workingHours.startDate), 1) 
    AND DATEFROMPARTS(DATEPART(YEAR, workingHours.endDate), DATEPART(MONTH,workingHours.endDate), 1)) 
AND workingHours.employeeID = 1

The new join condition is taking year and month parts with the DATEPART function, from workingHours's dates, and making sure that hourlyRates.rateDate is between those two.

DATEFROMPARTS function takes year, month and day integers (for the start of the month, we take that day is 1) and converts it to DATE.

A better way of doing the same thing would be using a newer join syntax:

SELECT el.employeeID
       ,el.employeeName
       ,wh.startdate
       ,wh.enddate
       ,DATEDIFF(HOUR,wh.startdate,wh.enddate) AS [Hours Worked]
       ,hr.rate AS [Hourly Rate]
FROM employees_list el 
    JOIN hourlyRates hr ON el.employeeID = hr.employeeID
    JOIN workingHours wh ON hr.employeeID = wh.EmployeeID 
        AND (hr.rateDate
            BETWEEN 
            DATEFROMPARTS(DATEPART(YEAR, wh.startDate), DATEPART(MONTH,wh.startDate), 1) 
            AND DATEFROMPARTS(DATEPART(YEAR, wh.endDate), DATEPART(MONTH,wh.endDate), 1)
        ) 
WHERE wh.employeeID = 1

Note that neither of these queries are Sargable because of the function calls in the join condition. These queries should be avoided when possible, as pointed out by @SOS