Home > Software design >  SQL-How to calculate most popular car brand by each State
SQL-How to calculate most popular car brand by each State

Time:01-18

State Brand Model
CA Ford F-150
FL Ford Mustang
CA Honda Civic
PA Dodge Charger
NY Nissan Altima
CA Ford F-150
NY Nissan Altima
PA Dodge Charger

So I tried asking this question before but I'm very new to SQL and to this site, so please forgive me for clarity and context if it's not enough. Big Query is the RDBMS being used.

Essentially I'm trying to group the best-selling brand by each state, regardless of model. An example is in CA, Ford appears twice out of the three times, and the third is Honda. So CA best selling brand is Ford. And by default for PA, it would be Dodge as they have appeared twice. Again, the Model can be ignored, I just put it for context, I'm just trying to compute the best-selling brand by each state that appears more than once per state.

Again, the model calculation is an added bonus but I'm just trying to group the state with the best-selling brand in that state that appears the most.

EDIT: The pasted-out table isn't coming out clean like the example one. But the expected out would be for state CA brand Ford as it appears two times out of the 3. For FL the default is Ford as it appears once, and for NY it's Nissan and PA Dodge.

CodePudding user response:

If you are looking for just an SQL Statement to do that, I think this is what you are looking for and it should work since Big Query is can run ANSI SQL compatible queries:

SELECT
       [State]
      ,[Brand]
      ,Count(Brand) AS BrandCount
  FROM 
       [dbo].[SoldCars]
  GROUP BY 
       [State],[Brand]
  ORDER BY
        [State], COUNT(Brand) DESC, [Brand]

Would output:

CA          Ford        2
CA          Honda       1
FL          Ford        1
NY          Nissan      2
PA          Dodge       2

Then you can re-arrange your order by however you like, and you can also add a HAVING clause, say you wanted to show only states that had 2 or more of each brand (manufacturer):

SELECT 
       [State]
      ,[Brand]
      , COUNT(Brand) AS BrandCount
  FROM 
       [dbo].[SoldCars]
  GROUP BY 
       [State],[Brand]
  HAVING 
       Count(Brand) >= 2
  ORDER BY
        [State], COUNT(Brand) DESC, [Brand]

Results:

CA          Ford        2
NY          Nissan      2
PA          Dodge       2

And if you wanted to add model, you add the Model column you can:

SELECT 
       [State]
      ,[Brand]
      ,[Model]
      ,Count(Model) AS ModelCount
  FROM 
       [dbo].[SoldCars]
  GROUP BY 
       [State],[Brand],[Model]
  ORDER BY
        [State],  COUNT(Model) DESC, [Model]

Results:

CA          Ford        F-150       2
CA          Honda       Civic       1
FL          Ford        Mustang     1
NY          Nissan      Altima      2
PA          Dodge       Charger     2

CodePudding user response:

You could use the QUALIFY clause on the DENSE_RANK analytic function ordered by COUNT(*) in descending order, the use ofDENSE_RANK is to return all rows that tie for the same max count.

SELECT State, Brand
FROM my_data
GROUP BY State, Brand
QUALIFY DENSE_RANK() OVER(PARTITION BY State ORDER BY COUNT(*) DESC) = 1

The output according to your sample input:

enter image description here

  • Related