Home > Net >  Partitions and ranking in Bigquery/SQL
Partitions and ranking in Bigquery/SQL

Time:07-15

Lets say I have a query pulling all countries, all of their cities, and each cities population.

I want it to only pull the largest city for each country. So it is limited to one city (the largest city) per country.

What is the best way to take the "MAX()" city for each country?

I am in BQ writing SQL

CodePudding user response:

Use below simple approach

select country, 
  array_agg(struct(city, population) order by population desc limit 1)[offset(0)].*
from your_table
group by country

CodePudding user response:

You could also use analytical window functions and the qualify clause:

SELECT  country,city,population
  FROM your_table
QUALIFY RANK() OVER (PARTITION BY country ORDER BY population DESC) = 1

Be aware in the unlikely case the two biggest cities of a country having equal population, you'll get two rows in the result set of this query. If you want to avoid this, use the ROW_NUMBER function instead, which in case of a tie chooses just one record. You could also add a tie-breaker to the order by clause of the window function.

  • Related