Home > Software engineering >  Window Functions vs Group by
Window Functions vs Group by

Time:03-03

I am fairly new to SQL and data engineering. Recently, I happened to come across window functions and realized that the operation window functions perform can also be done with group by. Can someone here please explain the difference between the two and when one would use window functions over group by.

To mods - Please do not close this question. This question has not been posted before and I believe a good answer will bring a lot of value to the stack community. I scoured the internet looking for answers, but no website/article/video had a clear explanation and they only added to my confusion.

CodePudding user response:

In short and very simplified: window functions retain result row, group by squashes them. If the information you need relates to aggregation you need group by, if the information relates to particular row you need window function.

For example: supposed you have collection of employees with department and salaries. To know what is the average salary in department, you use select dept_id, avg(salary) from emp group by dept (aggregation). To know how each employee differs from that average, use select emp_id, salary - avg(salary) over (partition by dept_id) from emp (window function).

There are also several use cases which can be solved by both window functions and group by, for example select grp, min(value) ... group by grp is equivalent to select grp, value from (select grp, row_number() over (partition by grp order by value) as rn, value ...) x where x.rn = 1) so the border is somewhat fuzzy and other criteria (such as readability, extensibility or performance) have to help to decide.

  • Related