Home > Back-end >  How to find the average of all other products in postgresql
How to find the average of all other products in postgresql

Time:11-11

I have a table Products which looks like the following:

 ----------- ----------- ---------- 
|ProductCode|ProductType|   ....   |
 ----------- ----------- ---------- 
|   ref01   |   BOOKS   |   ....   |
|   ref02   |   ALBUMS  |   ....   |
|   ref06   |   BOOKS   |   ....   |
|   ref04   |   BOOKS   |   ....   |
|   ref07   |   ALBUMS  |   ....   |
|   ref10   |   TOYS    |   ....   |
|   ref13   |   TOYS    |   ....   |
|   ref09   |   ALBUMS  |   ....   |
|   ref29   |   TOYS    |   ....   |
|   .....   |   .....   |   ....   |
 ----------- ----------- ---------- 

Another table Sales which looks like the following:

 ----------- ----------- ---------- 
|ProductCode|   Orders  |   ....   |
 ----------- ----------- ---------- 
|   ref01   |     15    |   ....   |
|   ref02   |     12    |   ....   |
|   ref06   |     20    |   ....   |
|   ref04   |     14    |   ....   |
|   ref07   |     11    |   ....   |
|   ref10   |     19    |   ....   |
|   ref13   |      3    |   ....   |
|   ref09   |      9    |   ....   |
|   ref29   |      5    |   ....   |
|   .....   |   .....   |   ....   |
 ----------- ----------- ---------- 

I am trying to find the products that were ordered more than the average of all other products of the same type.

By manually calculating, the result would be something like:

 ----------- ----------- ---------- 
|ProductCode|   Orders  |   ....   |
 ----------- ----------- ---------- 
|   ref02   |     12    |   ....   |
|   ref06   |     20    |   ....   |
|   ref07   |     11    |   ....   |
|   ref10   |     19    |   ....   |
|   .....   |   .....   |   ....   |
 ----------- ----------- ---------- 

So if looking in the type ALBUMS and product ref02, then I need to find the average of Orders of ALL OTHER ALBUMS. In this case, it is the average of ref06 and ref04, but there are more in the actual table. So what I need to do is the following:

Since product ref02 is 'ALBUMS', and ref07 and ref09 are also 'ALBUMS'. 
      So their average is (11 9)/2=10 <12.

Since product ref06 is 'BOOKS', and **ref01** and ref04 are also 'BOOKS'.                     
      So their average is (15 14)/2=14.5 <20.

Since product ref07 is 'ALBUMS', and **ref02** and ref09 are also 'ALBUMS'.           
      So their average is (12 9)/2=10.5<11.

Since product ref10 is 'TOYS', and ref13 and ref29 are also 'TOYS'           
      So their average is (3 5)/2=4<19.

The rest does not satisfy the condition thus will not be in the result.

I know how to and was able to find the average of orders for all products under the same type, but I have no idea how to find the average of orders for all other products under the same type.

I am using PostgreSQL, but cannot use any of these key words: WITH, OVER, LIMIT, PARTITION.

CodePudding user response:

We can try using AVG() as an analytic function here:

WITH cte AS (
    SELECT p.ProductCode, p.ProductType, s.Orders,
           AVG(s.Orders) OVER (PARTITION BY p.ProductType) AS AvgOrders
    FROM Products p
    INNER JOIN Sales s ON s.ProductCode = p.ProductCode
)

SELECT ProductCode, ProductType, Orders
FROM cte
WHERE Orders > AvgOrders;

screen capture from demo link below

Demo

CodePudding user response:

This is what I would use in Postgres 11 or later:
A solution with a window function using a custom frame definition:

SELECT product_code, orders
FROM  (
   SELECT product_code, s.orders
        , avg(orders) OVER (PARTITION BY p.product_type
                            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                            EXCLUDE CURRENT ROW) AS avg_orders
   FROM   product p
   JOIN   sales s USING (product_code)
   ) sub
WHERE  orders > avg_orders 
ORDER  BY product_code;  -- optional

I expect this to perform best.
Read the manual for details on window functions.

This is what you are asking for:
A solution without CTE, window function, or LIMIT (works in any modern Postgres version):

SELECT product_code, s.orders
FROM   product p
JOIN   sales s USING (product_code)
JOIN   LATERAL (
   SELECT avg(orders) AS avg_orders
   FROM   product p1
   JOIN   sales s1 USING (product_code)
   WHERE  p1.product_type =  p.product_type
   AND    p1.product_code <> p.product_code
   ) a ON a.avg_orders < s.orders
ORDER  BY product_code;  -- optional

Either produces your desired result.

db<>fiddle here

About LATERAL:

  • Related