Oracle SQL -> nested group functions and weird behaviour-CodePudding

I am getting familiar with pl\sql and I have a question about certain task.

I want to find the job with the lowest average salary, this is the solution from the script:

SELECT job_id, AVG(salary)
FROM employees
GROUP BY job_id
HAVING AVG(salary) = (SELECT MIN(AVG(salary))
                      FROM employees
                      GROUP BY job_id);

and the answer is job_id -> PU_CLERK, AVG(salary) -> 2700

My first question is why this code won't work:

SELECT job_id, MIN(AVG(salary))
FROM employees
GROUP BY job_id

I get an "not a single-group group function" error. By googeling the error, almost every solution is: To resolve the error, you can either remove the group function or column expression from the SELECT clause or you can add a GROUP BY clause that includes the column expressions.

Then why isn't this working? I have column expression in group by clause. Someone already posted about this but everyone just gave different solution. I would really appreciate if someone can actually explain.

With this query:

SELECT job_id, AVG(salary)
FROM employees
GROUP BY job_id
ORDER BY(salary)

I get a list of average salaries and PU_CLERK is the lowest with 2700, just like in the solution. But, if a limit a number of rows displayed with ROWNUM, i get different solution. So, this query:

SELECT job_id, AVG(salary)
FROM employees
WHERE ROWNUM = 1
GROUP BY job_id
ORDER BY(salary)

gives solution of SH_CLERK and 2600. Why is that? I tried working with FETCH but that just didn't work.

Also, this query:

SELECT MIN(AVG(salary))
FROM employees
GROUP BY job_id

gives solution of 2700, but it is missing a job_id so I can know which job position that actually is.

Thanks in advance to everyone for the answers and comments!

CodePudding user response：

It is not working as you are effectively doing:

SELECT job_id,
       MIN(avg_sal)
FROM   (
  SELECT job_id,
         AVG(salary) AS avg_sal
  FROM   employees
  GROUP BY job_id
)

And, while there is a GROUP BY clause on the inner sub-query there is not one on the outer query and the job_id column is not aggregated.

If you want to aggregate the outer query then:

SELECT MIN(job_id) KEEP (DENSE_RANK FIRST ORDER BY avg_sal),
       MIN(avg_sal)
FROM   (
  SELECT job_id,
         AVG(salary) AS avg_sal
  FROM   employees
  GROUP BY job_id
)

or, without the sub-query:

SELECT MIN(MIN(job_id)) KEEP (DENSE_RANK FIRST ORDER BY AVG(salary)) AS job_id,
       MIN(AVG(salary))
FROM   employees
GROUP BY job_id

In which, everything is aggregated twice.

Or, from Oracle 12, you can use FETCH FIRST ROW ONLY and order by the AVG(salary):

SELECT job_id, AVG(salary)
FROM employees
GROUP BY job_id
ORDER BY AVG(salary)
FETCH FIRST ROW ONLY;

or you can use ROWNUM with a nested sub-query:

SELECT *
FROM   (
  SELECT job_id, AVG(salary)
  FROM employees
  GROUP BY job_id
  ORDER BY AVG(salary)
)
WHERE ROWNUM = 1;

db<>fiddle here

CodePudding user response：

This is described in the documentation:

You can nest aggregate functions. For example, the following example calculates the average of the maximum salaries of all the departments in the sample schema hr:
SELECT AVG(MAX(salary))
 FROM employees
 GROUP BY department_id;

AVG(MAX(SALARY))
----------------
     10926.3333
This calculation evaluates the inner aggregate (MAX(salary)) for each group defined by the GROUP BY clause (department_id), and aggregates the results again.

Because there's no way to syntactically divide GROUP BY columns to inner aggregation and outer aggregation, an outer aggregation is performed with no GROUP BY columns. You cannot place any non-aggregated columns in the SELECT list in case of nested aggregation, because these columns are not available in the context of the outermost aggregation.

The question about ROWNUM should be a different question, but an answer is the order of evaluation: WHERE is evaluated before GROUP BY.