I have a question about joining 3 tables with some conditions.
I have 3 tables country, region, city. All I want is to return the largest city by population of every country and exclude the cities with 100.000 population or less. The thing is every table is connected with an id that is passed to the next table. Let me elaborate:
Country Table:
id | iso_code | name |
---|---|---|
1 | IN | India |
2 | US | United States |
Region Table:
id | country_id | name |
---|---|---|
3 | 1 | Delhi |
4 | 1 | Manipur |
5 | 2 | Nevada |
City Table:
id | region_id | name | population |
---|---|---|---|
6 | 3 | New Delhi | 320.000 |
7 | 3 | Delhi | 10.000.000 |
8 | 4 | Imphal | 220.000 |
9 | 5 | Las Vegas | 624.000 |
The output should look like this:
Country | City |
---|---|
India | Delhi |
United States | Las Vegas |
I have done so much research but I can't seem to find a solution. Thank you in advance.
CodePudding user response:
Join them from the foreign keys to the primary keys.
Add a ranking based on country and population.
Wrap it in a sub-query.
Then filter on it.
SELECT Country, City
FROM
(
SELECT
ctry.name AS Country
, ctry.iso_code AS CountryCode
, reg.name AS Region
, city.name AS City
, city.population AS CityPopulation
, DENSE_RANK() OVER (PARTITION BY ctry.id ORDER BY city.population DESC) AS Rnk
FROM Country AS ctry
JOIN Region AS reg ON reg.country_id = ctry.id
JOIN City AS city ON city.region_id = reg.id
) q
WHERE Rnk = 1
AND CityPopulation > 100000
ORDER BY CityPopulation DESC, Country, City
Note that I used DENSE_RANK instead of ROW_NUMBER.
The difference is that a DENSE_RANK would assign the same number to cities within a country with the same population. But a ROW_NUMBER wouldn't.
This way, if 2 cities are both the top in their country, they both get Rnk 1.
If you don't want that a country could potentially have more than 1 top country. Then use ROW_NUMBER instead, but with a tie-breaker.
F.e.
, ROW_NUMBER()
OVER (PARTITION BY ctry.id
ORDER BY city.population DESC,
LEN(city.name) DESC) AS Rnk