Home > Enterprise >  BigQuery - Grouping by date with a WHERE clause
BigQuery - Grouping by date with a WHERE clause

Time:02-08

My goal is to search for certain products, count the total per certain product, and group each by day.

Schema is like so:

line_items.sku created_at
product1 2020-04-02T13:22:44
product2 2020-04-02T05:01:22
product2 2020-04-03T14:21:10

My query is below:

SELECT
  EXTRACT(DAY
  FROM
    CAST(`order`.created_at AS DATETIME)) AS day_extracted,
  EXTRACT(MONTH
  FROM
    CAST(`order`.created_at AS DATETIME)) AS month_extracted,
  `order`.line_items.sku AS sku
FROM
  `mydatabase`
WHERE
  `order`.line_items.sku = "product 1"
  OR `order`.line_items.sku = "product 2"

The data is below:

row day_extracted month_extracted sku
1 5 2 product1
2 4 1 product2
2 4 1 product1

This is great and works, but I'm running into issues with needing to grouping the products and count per product total per day.

What am I doing wrong? If I add

  GROUP BY month_extracted, day_extracted

to the query, another error comes up

SELECT list expression references `order`.line_items which is neither grouped nor aggregated at [8:3]

Line 8 is:

`order`.line_items.sku AS sku

CodePudding user response:

The order in which a general SQL query is evaluated is thisenter image description here

Which means the group by clause doesn't even know what is month_extracted, day_extracted. So in order to fix this, either put the whole exp EXTRACT(.. in Group by OR. Use a subquery. And also there is a rule, that anything in SELECT which is not part of GROUP BY should be applied an AGGREGATE function. In your it is not hence an error.

select 
       day_extracted,
       month_extracted,
       any_value(sku) AS sku -- i used any_value to fix it, you can use any other agg. function as per your logic
 from (
SELECT
  EXTRACT(DAY
  FROM
    CAST(`order`.created_at AS DATETIME)) AS day_extracted,
  EXTRACT(MONTH
  FROM
    CAST(`order`.created_at AS DATETIME)) AS month_extracted,
    `order`.line_items.sku as Sku
  
FROM
  `mydatabase`
WHERE
  `order`.line_items.sku = "product 1"
  OR `order`.line_items.sku = "product 2"
) as _table
group by day_extracted,month_extracted

CodePudding user response:

Mr.Batra led me down the rabbit hole of subqueries and that led me to my solution. Knowing which order queries are executed in made more sense now too.

SELECT day_extracted,month_extracted,Sku,count(*) FROM 
    (
SELECT
    EXTRACT(DAY
    FROM
      CAST(`order`.created_at AS DATETIME)) AS day_extracted,
    EXTRACT(MONTH
    FROM
      CAST(`order`.created_at AS DATETIME)) AS month_extracted,
    `order`.line_items.sku AS Sku
  FROM
    `mydatabase`
  WHERE
    `order`.line_items.sku = "product1"
    OR `order`.line_items.sku = "product2"
    ) AS temp
    GROUP BY temp.Sku,day_extracted,month_extracted
    ORDER BY day_extracted

This gives me the data in this format:

day_extracted month_extracted Sku col1
1 2 product1 41
1 2 product2 55
2 2 product1 91
  •  Tags:  
  • Related