Home > Back-end >  Aggregating three tables but getting wrong values during the aggregation operation
Aggregating three tables but getting wrong values during the aggregation operation

Time:07-18

"employee" Table

emp_id empName
1 ABC
2 xyx

"client" Table:

id emp_id clientName
1 1 a
2 1 b
3 1 c
4 2 d

"collection" Table

id emp_id Amount
1 2 1000
2 1 2000
3 1 1000
4 1 1200

I want to aggregate values from the three tables input tables here reported as samples. For each employee I need to find

  • the total collection amount for that employee (as a sum)
  • the clients that are involved with the corresponding employee (as a comma-separated value)

Here follows my current query.

MyQuery:

SELECT emp_id, 
       empName, 
       GROUP_CONCAT(client.clientName ORDER BY client.id SEPARATOR '') AS clientName, 
       SUM(collection.Amount) 
FROM      employee
LEFT JOIN client
       ON clent.emp_id = employee.emp_id
LEFT JOIN collection
       ON collection.emp_id = employee.emp_id
GROUP BY employee.emp_id;  

The problem of this query is that I'm getting wrong values of sums and clients when an employee is associated to multiple of them.

Current Output:

emp_id empName clientName TotalCollection
1 ABC a,b,c,c,b,a,a,b,c 8400
2 xyz d,d 1000

Expected Output:

emp_id empName clientName TotalCollection
1 ABC a , b , c 4200
2 xyz d 1000

How can I solve this problem?

CodePudding user response:

There are some typos in your query:

  • the separator inside the GROUP_CONCAT function should be a comma instead of a space, given your current output, though comma is default value, so you can really omit that clause.
  • each alias in your select requires the table where it comes from, as long as those field names are used in more than one tables among the ones you're joining on
  • your GROUP BY clause should at least contain every field that is not aggregated inside the SELECT clause in order to have a potentially correct output.

The overall conceptual problem in your query is that the join combines every row of the "employee" table with every row of the "client" table (resulting in multiple rows and higher sum of amounts during the aggregation). One way for getting out of the rabbit hole is a first aggregation on the "client" table (to have one row for each "emp_id" value), then join back with the other tables.

SELECT emp.emp_id, 
       emp.empName, 
       cl.clientName, 
       SUM(coll.Amount)
FROM      employee emp
LEFT JOIN (SELECT emp_id, 
                  GROUP_CONCAT(client.clientName 
                               ORDER BY client.id) AS clientName 
           FROM client
           GROUP BY emp_id) cl
       ON cl.emp_id = emp.emp_id
LEFT JOIN (SELECT emp_id, Amount FROM collection) coll
       ON coll.emp_id = emp.emp_id
GROUP BY emp.emp_id, 
         emp.empName, 
         cl.clientName

Check the demo here.

CodePudding user response:

Regardless of my comment, here is a query for your desired output:

SELECT 
a.emp_id,
 a.empName,
 a.clientName,
SUM(col.Amount)  AS totalCollection
FROM (SELECT e.emp_id, 
       e.`empName`, 
       GROUP_CONCAT(DISTINCT c.clientName ORDER BY c.id ) AS clientName
FROM      employee e
LEFT JOIN `client` c
       ON c.emp_id = e.emp_id
       GROUP BY e.`emp_id`) a
 LEFT JOIN collection col
       ON col.emp_id = a.emp_id
GROUP BY col.emp_id;  

When having multiple joins, you should be careful about the relations and the number of results(rows) that your query generates. You might as well have multiple records in output than your desired ones. Hope this helps

CodePudding user response:

SELECT emp_id, 
       empName, 
       GROUP_CONCAT(client.clientName ORDER BY client.id SEPARATOR '') AS clientName, 
       C .Amount 
FROM      employee
LEFT JOIN client
       ON clent.emp_id = employee.emp_id
LEFT JOIN (select collection.emp_id , sum(collection.Amount ) as Amount  from collection group by collection.emp_id) C
       ON C.emp_id = employee.emp_id
GROUP BY employee.emp_id;

it works for me now

  • Related