I have two tables , I want to fetch correct data rows, but when I make a join in between incorrect extra rows are getting create. I want to exclude them.
Table T1
Employee_ID | Work_START |
---|---|
1111 | 10 Nov |
1111 | 14 Nov |
1111 | 18 Nov |
Table T2
Employee_ID | Work_END |
---|---|
1111 | 12 Nov |
1111 | 15 Nov |
1111 | 20 Nov |
I can not use Rank Function since MySQL version is 5.6 and I have Read access to DB so can not create INDEX or use SET function
I tried to make a join with Below SQL Query:
Select T1.Employee_ID, T1.Work_START, T2.Work_END from T1
Left Join T2 On T1.Employee_ID = T2.Employee_ID
where T2.Work_END > T1.Work_START
(used this condition to reduce the incorrect joined rows)
I tried using Left , Right join, using Distinct function as well
I am getting the result as below
Order_ID | Order_Date | Ship_Date |
---|---|---|
1111 | 10 Nov | 12 Nov |
1111 | 10 Nov | 15 Nov |
1111 | 10 Nov | 20 Nov |
1111 | 14 Nov | 15 Nov |
1111 | 14 Nov | 20 Nov |
1111 | 18 Nov | 20 Nov |
Expected Result is as below
Logic of Output : an employee has worked on a task on 3 different times, so to get those correct 3 rows I want the expected table to follow the below condition
row 1 work_END should be higher than row 1 Work_START and 2nd row Work_START should be higher than 1st row Work_END and so on next row Work_start should be higher than previos row Work_END
Expected Table
Order_ID | Order_Date | Ship_Date |
---|---|---|
1111 | 10 Nov | 12 Nov |
1111 | 14 Nov | 15 Nov |
1111 | 18 Nov | 20 Nov |
Please Note: I Have read access to DB and Can not use Rank function since MySQL version is 5.6
CodePudding user response:
Disclaimer : you should consider fixing your data model. Spreading that data over two different tables does not look like appropriate design.
With the current data model, we could approach the question like so: starting from each beginning date in the first table, bring the closest end date with a subquery:
select t1.employee_id, t1.work_start,
(
select min(t2.work_end)
from t2
where t2.employee_id = t1.employee_id and t2.work_end > t1.work_start
) work_end
from t1
This guarantees as many rows in the resultset as there are in the first table (not more, not less). If the dates of the two tables do not properly interleave, you might see results that look inconsistent somehow (using row_number()
would not avoid this).