I have a table with following data.
id country serial other_column
1 us 123 1
2 us 456 1
3 gb 123 1
4 gb 456 1
5 jp 777 1
6 jp 888 1
7 us 123 2
8 us 456 3
9 gb 456 4
10 us 123 1
11 us 123 1
Is there a way to fetch 2 rows per unique country and unique serial?
For example, expecting following results from my query.
us,123,1 comes twice cos there was 3 of the same kind and I want 2 rows per unique country and unique serial.
us,123,1
us,123,1
us,456,1
gb,123,1
gb,456,1
jp,777,1
jp,888,1
I can't use:
select distinct country, serial from my_table;
Since I want 2 rows per distinct value match for country and serial. Pls advice.
CodePudding user response:
Assign DENSE_RANK
and ROW_NUMBER
to your data set using a CTE
or subquery
then return rows with a ROW_NUMBER
less than 3
and a DENSE_RANK
equal to 1
. Also, since you did not specify the ORDERING
, I've added a custom ORDER BY
to handle your Country
sorting to match your desired output above.
SELECT
ID,
Country,
Serial,
other_column
FROM
(SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Country, Serial ORDER BY Country, Serial, other_column) AS RN,
DENSE_RANK() OVER (PARTITION BY Country, Serial ORDER BY Country, Serial, other_column) AS DR
FROM my_table) A
WHERE RN < 3 AND DR = 1
ORDER BY CASE WHEN Country = 'us' THEN 1
WHEN Country = 'gb' THEN 2
WHEN Country = 'jp' THEN 3
ELSE 4
END ASC, Country, Serial, other_column, ID
Result:
| ID | Country | Serial | other_column |
|----|---------|--------|---------------|
| 1 | us | 123 | 1 |
| 10 | us | 123 | 1 |
| 2 | us | 456 | 1 |
| 3 | gb | 123 | 1 |
| 4 | gb | 456 | 1 |
| 5 | jp | 777 | 1 |
| 6 | jp | 888 | 1 |
Fiddle here.
CodePudding user response:
- since you didn't specify any logic which rows it should pick based on the 'other column' value (i am assuming it doesn't matter for you).
Having said that, my code will always pick two rows based on unique country and serial with the other_column value as ascending. For example if you have 3 rows:
- us, 123, 1
- us, 123, 1
- us, 123, 2
it will go for first two since other_column value is set to ASC, if you want the other way around you can change the code to order by DESC
within the partition by clause.
If there are less than 3 rows for the country and serial it would just pick 1 row.
for example
- us, 456, 1
- us, 456, 1
- us, 123, 1
- us, 123, 2
- us, 123, 3
would result in:
- us, 456,1
- us, 123,1
- us, 123,2
with main as (
select
*,
count(*) over(partition by country, serial) as total_occurence,
row_number() over(partition by country, serial order by other_column) as rank_
from <table_name>
),
conditions as (
select *,
case when total_occurence < 3 and rank_ = 1 then true
when total_occurence >=3 and rank_ in (1,2) then true else
false end as is_relevant
from main
)
select * from conditions where is_relevant