I am getting familiar with pl\sql and I have a question about certain task.
I want to find the job with the lowest average salary, this is the solution from the script:
SELECT job_id, AVG(salary)
FROM employees
GROUP BY job_id
HAVING AVG(salary) = (SELECT MIN(AVG(salary))
FROM employees
GROUP BY job_id);
and the answer is job_id -> PU_CLERK, AVG(salary) -> 2700
My first question is why this code won't work:
SELECT job_id, MIN(AVG(salary))
FROM employees
GROUP BY job_id
I get an "not a single-group group function" error. By googeling the error, almost every solution is: To resolve the error, you can either remove the group function or column expression from the SELECT clause or you can add a GROUP BY clause that includes the column expressions.
Then why isn't this working? I have column expression in group by clause. Someone already posted about this but everyone just gave different solution. I would really appreciate if someone can actually explain.
With this query:
SELECT job_id, AVG(salary)
FROM employees
GROUP BY job_id
ORDER BY(salary)
I get a list of average salaries and PU_CLERK is the lowest with 2700, just like in the solution. But, if a limit a number of rows displayed with ROWNUM, i get different solution. So, this query:
SELECT job_id, AVG(salary)
FROM employees
WHERE ROWNUM = 1
GROUP BY job_id
ORDER BY(salary)
gives solution of SH_CLERK and 2600. Why is that? I tried working with FETCH but that just didn't work.
Also, this query:
SELECT MIN(AVG(salary))
FROM employees
GROUP BY job_id
gives solution of 2700, but it is missing a job_id so I can know which job position that actually is.
Thanks in advance to everyone for the answers and comments!
CodePudding user response:
It is not working as you are effectively doing:
SELECT job_id,
MIN(avg_sal)
FROM (
SELECT job_id,
AVG(salary) AS avg_sal
FROM employees
GROUP BY job_id
)
And, while there is a GROUP BY
clause on the inner sub-query there is not one on the outer query and the job_id
column is not aggregated.
If you want to aggregate the outer query then:
SELECT MIN(job_id) KEEP (DENSE_RANK FIRST ORDER BY avg_sal),
MIN(avg_sal)
FROM (
SELECT job_id,
AVG(salary) AS avg_sal
FROM employees
GROUP BY job_id
)
or, without the sub-query:
SELECT MIN(MIN(job_id)) KEEP (DENSE_RANK FIRST ORDER BY AVG(salary)) AS job_id,
MIN(AVG(salary))
FROM employees
GROUP BY job_id
In which, everything is aggregated twice.
Or, from Oracle 12, you can use FETCH FIRST ROW ONLY
and order by the AVG(salary)
:
SELECT job_id, AVG(salary)
FROM employees
GROUP BY job_id
ORDER BY AVG(salary)
FETCH FIRST ROW ONLY;
or you can use ROWNUM
with a nested sub-query:
SELECT *
FROM (
SELECT job_id, AVG(salary)
FROM employees
GROUP BY job_id
ORDER BY AVG(salary)
)
WHERE ROWNUM = 1;
db<>fiddle here
CodePudding user response:
This is described in the documentation:
You can nest aggregate functions. For example, the following example calculates the average of the maximum salaries of all the departments in the sample schema hr:
SELECT AVG(MAX(salary)) FROM employees GROUP BY department_id; AVG(MAX(SALARY)) ---------------- 10926.3333
This calculation evaluates the inner aggregate (MAX(salary)) for each group defined by the GROUP BY clause (department_id), and aggregates the results again.
Because there's no way to syntactically divide GROUP BY
columns to inner aggregation and outer aggregation, an outer aggregation is performed with no GROUP BY
columns. You cannot place any non-aggregated columns in the SELECT
list in case of nested aggregation, because these columns are not available in the context of the outermost aggregation.
The question about ROWNUM
should be a different question, but an answer is the order of evaluation: WHERE
is evaluated before GROUP BY
.