Home > Blockchain >  (SQL) How do you select a max float value along with other datatypes values within a query?
(SQL) How do you select a max float value along with other datatypes values within a query?

Time:09-04

I'm working with the Iowa Liquor Sales dataset which in this case is called "bigquery-public-data.iowa_liquor_sales.sales". Relevant columns and their datatypes are date(DATE), sale_dollars(FLOAT), item_description(STRING), store_name(STRING).

I am trying to write a query that will return the top sale for each year, of the past three years (2021,2020,2019) along with the date, item_description, and store_name.

The below code works, but only covers one year. I know I could copy paste and change the date every time but that seems tedious. Is there a better way?

SELECT date, sale_dollars, item_description, store_name
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2021-01-01' and '2021-12-31'
ORDER BY sale_dollars DESC
LIMIT 1
date sale_dollars item_description store_name
2021-04-19 250932.0 Titos Handmade Vodka Hy-Vee #3

When trying different ways to write it so the max sale of 2019,2020, and 2021 return along with their date, item_description, and store_name, I ran into errors. The below is the closest I got (missing date, item_description, and store_name).

SELECT

(SELECT MAX(sale_dollars)
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2021-01-01' and '2021-12-31') as sale_2021,

(SELECT MAX(sale_dollars)
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2020-01-01' and '2020-12-31') as sale_2020,

(SELECT MAX(sale_dollars)
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2019-01-01' and '2019-12-31') as sale_2019

How can I write a query that returns the max sale of the past three years along with it's date, item, and store name?

CodePudding user response:

Consider below query

SELECT EXTRACT(YEAR FROM date) year, 
       ARRAY_AGG(
         STRUCT(date, sale_dollars, item_description, store_name) 
         ORDER BY sale_dollars DESC LIMIT 1
       )[OFFSET(0)].*
  FROM `bigquery-public-data.iowa_liquor_sales.sales`
 WHERE date BETWEEN '2019-01-01' AND '2021-12-31'
 GROUP BY 1;
Query results
 ------ ------------ -------------- ---------------------- ------------------------------- 
| year |    date    | sale_dollars |   item_description   |          store_name           |
 ------ ------------ -------------- ---------------------- ------------------------------- 
| 2020 | 2020-10-08 |     250932.0 | Titos Handmade Vodka | Hy-Vee #3 / BDI / Des Moines  |
| 2019 | 2019-10-08 |      78435.0 | Makers Mark          | Hy-Vee Food Store / Urbandale |
| 2021 | 2021-07-05 |     250932.0 | Titos Handmade Vodka | Hy-Vee #3 / BDI / Des Moines  |
 ------ ------------ -------------- ---------------------- ------------------------------- 

or, you can get same result with a window function

 SELECT date, sale_dollars, item_description, store_name
   FROM `bigquery-public-data.iowa_liquor_sales.sales`
  WHERE date BETWEEN '2019-01-01' AND '2021-12-31'
QUALIFY ROW_NUMBER() OVER (
          PARTITION BY EXTRACT(YEAR FROM date) ORDER BY sale_dollars DESC
        ) = 1;

CodePudding user response:

As the three values deliver only one value, you can add them to the first query, only adapted to three years

SELECT 
    date, sale_dollars, item_description, store_name,
    (SELECT MAX(sale_dollars)
    FROM `bigquery-public-data.iowa_liquor_sales.sales`
    WHERE date between '2021-01-01' and '2021-12-31') as sale_2021,    
    (SELECT MAX(sale_dollars)
    FROM `bigquery-public-data.iowa_liquor_sales.sales`
    WHERE date between '2020-01-01' and '2020-12-31') as sale_2020,    
    (SELECT MAX(sale_dollars)
    FROM `bigquery-public-data.iowa_liquor_sales.sales`
    WHERE date between '2019-01-01' and '2019-12-31') as sale_2019
FROM `bigquery-public-data.iowa_liquor_sales.sales`
WHERE date between '2019-01-01' and '2021-12-31'
ORDER BY sale_dollars DESC
LIMIT 1
  • Related