Below is the table I have created and inserted values in it:
CREATE TABLE employees_list
(employeeID int identity(1,1),
employeeName varchar(25))
GO
INSERT INTO employees_list VALUES ('Kevin'),('Charles')
GO
CREATE TABLE hourlyRates
(employeeID int,
rate int,
rateDate date)
INSERT INTO hourlyRates VALUES (1, 28, '2016-01-01'),
(1, 39, '2016-02-01'),
(2, 43, '2016-01-01'),
(2, 57, '2016-02-01')
CREATE TABLE workingHours
(employeeID int,
startdate datetime,
enddate datetime)
GO
INSERT INTO workingHours VALUES (1, '2016-01-01 09:00', '2016-01-01 17:00'),
(1, '2016-01-02 09:00', '2016-01-02 17:00'),
(1, '2016-02-01 10:00', '2016-02-01 16:00'),
(1, '2016-02-02 11:00', '2016-02-02 13:00'),
(2, '2016-01-01 10:00', '2016-01-01 16:00'),
(2, '2016-01-02 08:00', '2016-01-02 14:00'),
(2, '2016-02-01 14:00', '2016-02-01 19:00'),
(2, '2016-02-02 13:00', '2016-02-02 16:00')
GO
SELECT * FROM employees_list
SELECT * FROM hourlyRates
SELECT * FROM workingHours
Now the question is:
Display employee ID, name, start date, end date, hours worked and hourly rate for the Employee whose ID number is 1.
This is what I have done:
SELECT workingHours.employeeID,employeeName,startdate,enddate,
DATEDIFF(HOUR,startdate,enddate) AS 'Hours Worked',
rate AS 'Hourly Rate'
FROM hourlyRates,workingHours,employees_list
WHERE hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND workingHours.employeeID = 1
And I got the following result:
The problem with the result above is that the result is being repeated or duplicated from row number 5 to row number 8. It is supposed to generate the first 4 rows if I'm not mistaken.
I even tried adding DISTINCT in the SELECT statement and still it is showing duplicated result.
What change is needed on my query to eliminate the duplication?
CodePudding user response:
The problem is you're joining the tables only on the employeeID
. With that, you will get a row for every combination of hourlyRates
and workingHours
- which is 8 in this case. You'd have to join the tables somehow on the dates as well. You want to take a rate from hourlyRates
only when it's in the correct month. One of the ways to do that would be:
SELECT workingHours.employeeID,employeeName,startdate,enddate,
DATEDIFF(HOUR,startdate,enddate) AS 'Hours Worked',
rate AS 'Hourly Rate'
FROM hourlyRates,workingHours,employees_list
WHERE hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND (hourlyRates.rateDate
BETWEEN
DATEFROMPARTS(DATEPART(YEAR, workingHours.startDate), DATEPART(MONTH,workingHours.startDate), 1)
AND DATEFROMPARTS(DATEPART(YEAR, workingHours.endDate), DATEPART(MONTH,workingHours.endDate), 1))
AND workingHours.employeeID = 1
The new join condition is taking year and month parts with the DATEPART function, from workingHours
's dates, and making sure that hourlyRates.rateDate
is between those two.
DATEFROMPARTS function takes year, month and day integers (for the start of the month, we take that day is 1) and converts it to DATE
.
A better way of doing the same thing would be using a newer join syntax:
SELECT el.employeeID
,el.employeeName
,wh.startdate
,wh.enddate
,DATEDIFF(HOUR,wh.startdate,wh.enddate) AS [Hours Worked]
,hr.rate AS [Hourly Rate]
FROM employees_list el
JOIN hourlyRates hr ON el.employeeID = hr.employeeID
JOIN workingHours wh ON hr.employeeID = wh.EmployeeID
AND (hr.rateDate
BETWEEN
DATEFROMPARTS(DATEPART(YEAR, wh.startDate), DATEPART(MONTH,wh.startDate), 1)
AND DATEFROMPARTS(DATEPART(YEAR, wh.endDate), DATEPART(MONTH,wh.endDate), 1)
)
WHERE wh.employeeID = 1
Note that neither of these queries are Sargable because of the function calls in the join condition. These queries should be avoided when possible, as pointed out by @SOS