I have this dataset:
shopID supplier supply_count
1 a 12
2 b 12
2 f 12
2 e 10
3 b 5
3 a 2
4 f 15
4 c 11
I have used this query to get 2nd highest value for each group:
select s1.shopId, max(s1.supply_count)
from supply s1
where supply_count NOT IN (
select max(supply_count)
from supply s2
where s1.shopId = s2.shopId
)
group by s1.shopId
The results I'm getting are:
shopID supply_count
2 10
3 2
4 11
The desired output is: (in case there is no second highest I want to present the highest)
shopID supply_count
1 12
2 10
3 2
4 11
Based on this question: Retrieve 2nd highest count by each group
CodePudding user response:
You can get desired output to include the value where each shopId has only a single row by also counting the rows for each ShopId.
Based on some more varied data and comments I believe the following should give the desired results:
with c as (
select *,
case when dense_rank() over(partition by shopid order by supply_count desc)=2
or Min(supply_count) over(partition by shopid)
=Max(supply_count) over(partition by shopid)
then 1 else 0 end as valid
from t
)
select distinct shopId, supply_count
from c
where valid=1
Example DBFiddle with some more varied data
CodePudding user response:
select
shopID,
supply_count
from
(select shopID,
supply_count, dense_rank() over(partition by shopID order by supply_count desc) as rnk
from supply) a
where rnk=2
union
select
shopID,
supply_count
from
(select shopID,
supply_count, dense_rank() over(partition by shopID order by supply_count desc) as rnk
from supply) a
group shopID,
supply_count having max(rnk)=1;
I believe this should work. What I have tried here is to identify the second highest supply count by assigning a rank for each group and doing a union with the shopIDs that have only one record.
If you don't understand what a dense_rank()
does, read this : https://towardsdatascience.com/how-to-use-sql-rank-and-dense-rank-functions-7c3ebf84b4e8