I am looking for a SQL query that, given a table with this data:
-------------------------
| category | id  | time |
-------------------------
| A        | abc | 1    |
| A        | abc | 1    |
| B        | abc | 3    |
| C        | abc | 4    |
| A        | xyz | 4    |
| B        | xyz | 5    |
| C        | xyz | 7    |
| C        | xyz | 7    |
-------------------------
produces this output:
------------------------
| category | id  | cnt |
------------------------
| A        | xyz | 1   |
| B        | xyz | 1   |
| C        | xyz | 2   |
------------------------
I got as far as this:
select category, id, count(*) as cnt
from table
group by category, id
Now I need to know which id is the most recent (greatest) based on the time column, so that I can filter to only that subset. The time column can only increase.
Is there a better alternative to the following?
select category, id, count(*) as cnt
from table
where id=(select id from table order by time desc limit 1)
group by category, id
CodePudding user response:
with data(category, id, time) as (
    select * from values
        ('A', 'abc', 1),
        ('A', 'abc', 1),
        ('B', 'abc', 3),
        ('C', 'abc', 4),
        ('A', 'xyz', 4),
        ('B', 'xyz', 5),
        ('C', 'xyz', 7),
        ('C', 'xyz', 7)
)
select
    category,
    id,
    count(*) as cnt
from data
where id = (select id from data order by time desc limit 1)
group by 1, 2;
This works, as you note:
CATEGORY | ID | CNT |
---|---|---|
A | xyz | 1 |
B | xyz | 1 |
C | xyz | 2 |
Another method is to convert that WHERE into a JOIN (note that this picks the most recent id within each category, rather than a single id overall):
select
    d.category,
    d.id,
    count(*) as cnt
from data as d
join (
    select z.category, z.id
    from data as z
    qualify row_number() over (partition by z.category order by z.time desc) = 1
) as c
    on d.category = c.category and d.id = c.id
group by 1, 2;
I would test both on your data to see which performs better for you.
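As a rough way to try both queries side by side, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is only a stand-in here, which is an assumption: it has no `qualify` clause, so the join is rewritten with a `row_number()` subquery (this needs SQLite 3.25 or newer). The `run_both` helper and the extra `('D', 'abc', 2)` row are hypothetical, added only to show a case where the two queries diverge.

```python
import sqlite3

# Sample rows from the question: (category, id, time).
ROWS = [
    ("A", "abc", 1), ("A", "abc", 1), ("B", "abc", 3), ("C", "abc", 4),
    ("A", "xyz", 4), ("B", "xyz", 5), ("C", "xyz", 7), ("C", "xyz", 7),
]

# WHERE approach: keep only the id of the single most recent row.
WHERE_QUERY = """
select category, id, count(*) as cnt
from data
where id = (select id from data order by time desc limit 1)
group by category, id
order by category
"""

# JOIN approach, with QUALIFY rewritten as a row_number() subquery
# because SQLite does not support QUALIFY.
JOIN_QUERY = """
select d.category, d.id, count(*) as cnt
from data as d
join (
    select category, id
    from (
        select category, id,
               row_number() over (partition by category order by time desc) as rn
        from data
    )
    where rn = 1
) as c
  on d.category = c.category and d.id = c.id
group by d.category, d.id
order by d.category
"""


def run_both(rows):
    """Hypothetical helper: load rows, run both queries, return both results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table data (category text, id text, time integer)")
    conn.executemany("insert into data values (?, ?, ?)", rows)
    where_result = conn.execute(WHERE_QUERY).fetchall()
    join_result = conn.execute(JOIN_QUERY).fetchall()
    conn.close()
    return where_result, join_result


# On the question's data both queries agree.
where_result, join_result = run_both(ROWS)
print(where_result)
print(join_result)

# Hypothetical extra row: category D exists only under the older id.
where_extra, join_extra = run_both(ROWS + [("D", "abc", 2)])
print(where_extra)
print(join_extra)
```

On the question's data the two result sets match. With the extra row, the WHERE version still returns only `xyz` rows, while the join version also returns a `D`/`abc` row, because it ranks rows within each category. Which behaviour is wanted depends on the data, and timings from SQLite will not carry over to another engine.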