join operation in SQL-CodePudding

There is such a task: By joining the tables HR.DEPARTMENTS and HR.EMPLOYEES, display complete data on departments in which the minimum salary is below 5000.

I tried to do this, but it gives an error

select distinct d.department_id,department_name, 
       d.manager_id, location_id 
from hr.departments d 
left join hr.employees e on e.department_id = d.department_id
where min(e.salary) < 5000 
order by 1

Error: group function is not allowed here

This is what hr.employees looks like:

EMPLOYEE_ID FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
    100 Steven King SKING 515.123.4567 17-JUN-03 AD_PRES 24000 - - 90

hr.departments:

DEPARTMENT_ID DEPARTMENT_NAME MANAGER_ID LOCATION_ID
10 Administration 200 1700

CodePudding user response：

You cannot use MIN in the WHERE clause, because MIN is an aggregation result over many rows, but in a WHERE clause you look at single rows (before any aggregation takes place).

The task to get the departments in question by joining the tables is a bit weird, because this is not how this should be done in SQL. If you must do it this way, then you only need a slight change to your query: Change the join into an inner join and check the rows' salary.

select distinct
  d.department_id, department_name, d.manager_id, location_id 
from hr.departments d 
join hr.employees e on e.department_id = d.department_id
where e.salary < 5000 
order by d.department_id;

The proper solution would use EXISTS or IN instead, so as not to create an unnecessarily large intermediate result that you must get rid of with DISTINCT:

select *
from hr.departments
where department_id in (select department_id from employees where salary < 5000)
order by department_id;

select *
from hr.departments d
where exists
(
  select null
  from employees e
  where e.salary < 5000
  and e.department_id = d.department_id
)
order by department_id;

CodePudding user response：

This works for your solution, where is use for row filtering like gender = 'Male' while having is for aggregating filtering functions like min(salary) < 5000 but for having you need to group by with something like department.

SELECT 
    *
FROM 
    DimEmployee
WHERE
    EmployeeID IN (
        SELECT 
            EmployeeID
        FROM
            DimEmployee
        GROUP BY
            EmployeeID
        HAVING  
            MIN(Salary) < 5000
    )

CodePudding user response：

First of all, don't use distinct, unless you have to. Secondly, you can't use group functions like that.

In order to solve this, you need to break the task into steps, breaking down your sentences.

"...the tables"

So we have this:

SELECT * FROM hr.departments;

... and ...

SELECT * FROM hr.employees;

"HR.DEPARTMENTS and HR.EMPLOYEES"

As you pointed our, the FK is the department.

(we first test the join, then add what we need) (the 1 is just a placeholder; you can use EMPLOYEE_ID or COUNT(1), it's irrelevant)

 SELECT 1 
 FROM hr.employees e
 LEFT JOIN hr.departments d on e.department_id = d.department_id;

"display complete data on departments"

Well, this is simple, you just enumerate the columns you need or use d.*. We'll do this later.

"which the minimum salary is below 5000"

Now we get to the blocking issue. Let's list the records.

 SELECT d.*
 FROM hr.employees e, hr.departments d 
 WHERE e.department_id = d.department_id
 AND EXISTS (SELECT 1 FROM hr.employees m WHERE m.department_id = d.department_id GROUP BY m.department_id HAVING min(m.salary) < 5000);

But what's this? We get a line for every employee of that department. Well, we can either use DISTINCT, but that is bad practice or we can fix the query.

We'll just remove the employees from the join.

 SELECT d.*
 FROM hr.departments d 
 WHERE EXISTS (SELECT 1 FROM hr.employees e WHERE e.department_id = d.department_id GROUP BY e.department_id HAVING min(e.salary) < 5000);

UPDATE:

To respect the task "By joining the tables"

So we have this:

SELECT d.*
FROM hr.departments d,
(
    SELECT e.department_id
    FROM   hr.employees e
    GROUP  BY e.department_id
    HAVING min(salary) < 5000
) e
WHERE e.department_id = d.department_id;