I can find the duplicate Employee No and then identify of the 2 (or more) rows which fields differ but need a query which tell me the fields that differ.
i.e.
ID | EmployeeNo | Name | Job | Salary | Country
1 | 1234 | Fred | Analyst | 50,000 | UK
2 | 1234 | Fred | Engineer | 50,000 | UK
3 | 2345 | John | Typist | 25,000 | UK
4 | 5678 | Rose | Florist | 10,000 | UK
5 | 5678 | Rose | Actress | 60,000 | UK
6 | 9999 | Sid | Driver | 10,000 | UK
7 | 9999 | Sid | Driver | 20,000 | Germany
8 | 9999 | Sid | Driver | 20,000 | UK
So, I'm trying to develop a query that tells me I have duplicate EmployeeNo records and what fields differ
i.e. 1234 occurs twice and the Job field differs
5678 occurs twice and the Job field and the Salary field differ
9999 occurs 3 times and the Salary and Country fields have differences
My table is much wider and longer than this, so looking for the differences is not that easy, having to scan across the adjacent rows to spot the fields that differ
CodePudding user response:
Use a self-join to find the rows with the same employee number, then use conditions like IF(e1.job != e2.join, 'Jobs', NULL)
to determine which fields differ.
SELECT e1.EmployeeNo, e1.Name, e1.id, e2.id,
CONCAT_WS(', ',
IF(e1.job != e2.job, 'Job', NULL),
IF(e1.salary != e2.salary, 'Salary', NULL),
-- repeat above for all other columns
) AS differences
FROM employee AS e1
JOIN employee AS e2 ON e1.EmployeeNo = e2.EmployeeNo AND e1.id < e2.id
HAVING differences != ''