I have a dataset that is similar to this. I need to pick out the most recent metadata (greater execution time = more recent) for a client including the sum of quantities and the latest execution time and meta where the quantity > 0
| Name | Quantity | Metadata | Execution time |
| -------- | ---------|----------|----------------|
| Neil | 1 | [1,3] | 4 |
| James | 1 | [2,18] | 5 |
| Neil | 1 | [4, 1] | 6 |
| Mike | 1 | [5, 42] | 7 |
| James | -1 | Null | 8 |
| Neil | -1 | Null | 9 |
Eg the query needs to return:
| Name | Summed Quantity | Metadata | Execution time |
| -------- | ----------------|----------|----------------|
| James | 0 | [2,18] | 5 |
| Neil | 1 | [4, 1] | 6 |
| Mike | 1 | [5, 42] | 7 |
My query doesn't quite work as it's not returning the sum of the quantities correctly.
SELECT
distinct on (name) name,
(
SELECT
cast(
sum(quantity) as int
)
) as summed_quantity,
meta,
execution_time
FROM
table
where
quantity > 0
group by
name,
meta,
execution_time
order by
name,
execution_time desc;
This query gives a result of
| Name | Summed Quantity | Metadata | Execution time |
| -------- | ----------------|----------|----------------|
| James | 1 | [2,18] | 5 |
| Neil | 1 | [4, 1] | 6 |
| Mike | 1 | [5, 42] | 7 |
ie it's just taking the quantity > 0 from the where and not adding up the quantities in the sub query (i assume because of the distinct clause) I'm unsure how to fix my query to produce the desired output.
CodePudding user response:
I think you need inner join with custome table that make sum of quantity :
SELECT distinct on (table.name) table.name, t.sum_qte, metadata, execution
from table
inner join
(select sum(quantity) sum_qte, name from table where quantity > 0 group by name) t on table.name = t.name
where quantity > 0
CodePudding user response:
This can be achieved using window functions (hence with a single pass of the data)
select
name
, sum_qty
, metadata
, execution_time
from (
select
name, metadata, execution_time
, sum(quantity) over(partition by name) as sum_qty
, row_number() over(partition by name order by execution_time DESC) as rn
from mytable
where quantity > 0
) as d
where rn = 1
order by name
result
------- --------- ---------- ----------------
| name | sum_qty | metadata | execution_time |
------- --------- ---------- ----------------
| James | 1 | [2,18] | 5 |
| Mike | 1 | [5,42] | 7 |
| Neil | 2 | [4,1] | 6 |
------- --------- ---------- ----------------
note: the summed quantity for James and Neil don't agree with the expected result - but I believe the result here is correct, see below:
| James | 1 | [2,18] | 5 | -- "most recent"
| Neil | 1 | [1,3] | 4 |
| Neil | 1 | [4, 1] | 6 | -- "most recent"
db<>fiddle here