Home > Software design >  Postgres distinct rows whilst also summing
Postgres distinct rows whilst also summing

Time:07-07

I have a dataset that is similar to this. I need to pick out the most recent metadata (greater execution time = more recent) for a client including the sum of quantities and the latest execution time and meta where the quantity > 0

| Name     | Quantity | Metadata | Execution time |
| -------- | ---------|----------|----------------|
| Neil     | 1        | [1,3]    |  4             |
| James    | 1        | [2,18]   |  5             |
| Neil     | 1        | [4, 1]   |  6             |
| Mike     | 1        | [5, 42]  |  7             |
| James    | -1       | Null     |  8             |
| Neil     | -1       | Null     |  9             |

Eg the query needs to return:

| Name     | Summed Quantity | Metadata | Execution time |
| -------- | ----------------|----------|----------------|
| James    | 0               | [2,18]   |  5             |
| Neil     | 1               | [4, 1]   |  6             |
| Mike     | 1               | [5, 42]  |  7             |

My query doesn't quite work as it's not returning the sum of the quantities correctly.

SELECT 

distinct on (name) name, 
(
    SELECT 
      cast(
        sum(quantity) as int
        
      )
  ) as summed_quantity,
     
  meta,
  execution_time  
FROM 
  table 
where 
 quantity > 0
group by 
  name,  
  meta, 
  execution_time 
order by 
  name, 
  execution_time desc;

This query gives a result of

| Name     | Summed Quantity | Metadata | Execution time |
| -------- | ----------------|----------|----------------|
| James    | 1               | [2,18]   |  5             |
| Neil     | 1               | [4, 1]   |  6             |
| Mike     | 1               | [5, 42]  |  7             |

ie it's just taking the quantity > 0 from the where and not adding up the quantities in the sub query (i assume because of the distinct clause) I'm unsure how to fix my query to produce the desired output.

CodePudding user response:

I think you need inner join with custome table that make sum of quantity :

SELECT distinct on (table.name) table.name, t.sum_qte, metadata, execution 
from table
inner join 
(select sum(quantity) sum_qte, name from table where quantity > 0 group by name) t on table.name = t.name 
where quantity > 0 

CodePudding user response:

This can be achieved using window functions (hence with a single pass of the data)

select
       name
     , sum_qty
     , metadata
     , execution_time
from (
     select 
            name, metadata, execution_time
          , sum(quantity) over(partition by name) as sum_qty
          , row_number() over(partition by name order by execution_time DESC) as rn
     from mytable
     where quantity > 0 
     ) as d
where rn = 1
order by name

result

 ------- --------- ---------- ---------------- 
| name  | sum_qty | metadata | execution_time |
 ------- --------- ---------- ---------------- 
| James |       1 | [2,18]   |              5 |
| Mike  |       1 | [5,42]   |              7 |
| Neil  |       2 | [4,1]    |              6 |
 ------- --------- ---------- ---------------- 

note: the summed quantity for James and Neil don't agree with the expected result - but I believe the result here is correct, see below:

| James    | 1        | [2,18]   |  5             | -- "most recent"

| Neil     | 1        | [1,3]    |  4             |
| Neil     | 1        | [4, 1]   |  6             | -- "most recent"

db<>fiddle here

  • Related