Home > front end >  Decluttering a SQL query
Decluttering a SQL query

Time:02-19

For a practice project i wrote the following query and i was wondering if there is way to make it more efficient than writing everything 12 times like a for loop for sql.

CREATE TABLE temp (month INT, total_sales INT, market_share decimal(5,2), year_change decimal(5,2))

insert into temp (month)
Values (1)

UPDATE temp
SET total_sales = (
    SELECT COUNT(purchases_2020.purchaseid)
    FROM purchases_2020
    JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
    WHERE (categories.category = 'whole milk' OR categories.category = 'yogurt' OR categories.category = 'domestic eggs') AND (purchases_2020.fulldate BETWEEN '2020-01-01' AND '2020-01-31')
) 
WHERE month = 1

UPDATE temp
SET market_share = (
        SELECT (SELECT 100 * COUNT(purchases_2020.purchaseid)
                FROM purchases_2020
                JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
                WHERE (categories.category = 'whole milk' OR categories.category = 'yogurt' OR categories.category = 'domestic eggs') AND (purchases_2020.fulldate BETWEEN '2020-01-01' AND '2020-01-31'))
       * 1. /  
               (SELECT COUNT(purchases_2020.purchaseid) 
               FROM purchases_2020
               WHERE purchases_2020.fulldate BETWEEN '2020-01-01' AND '2020-01-31')
) 
WHERE month = 1

UPDATE temp
SET year_change = (
    SELECT market_share - 
        (SELECT 
            (SELECT 100 * COUNT(purchases_2019.purchase_id)
            FROM purchases_2019
            JOIN categories ON purchases_2019.purchase_id = categories.purchase_id
            WHERE (categories.category = 'whole milk' OR categories.category = 'yogurt' OR categories.category = 'domestic eggs') AND (purchases_2019.full_date BETWEEN '2019-01-01' AND '2019-01-31'))
        * 1./  
            (SELECT COUNT(purchases_2019.purchase_id) 
            FROM purchases_2019
            WHERE purchases_2019.full_date BETWEEN '2019-01-01' AND '2019-01-31'))
    FROM temp
    WHERE month = 1
) 
WHERE month = 1

EDIT

I was given the 3 tables represented on the following database schema , and im trying to create a table with the total sales of dairy every month, the monthly market share of the dairy products and the difference between the 2020 monthly market share and the 2019 monthly market share (the year change colunm)

There is also an aritmethic error somewhere, when checking the project i get the following message ResultSet does not contain the correct numeric values! and im at my wits end looking for it butmy priority is to decluter the query.

CodePudding user response:

Your error message tells me that you are trying to run this from a reporting tool or a host language.

It also makes no sense to put the data into separate tables by years.

SQL is a declarative language that works with data as sets.

Instead of pushing the results into table temp, try writing a query like this:

with all_data as (
  select p.fulldate, p.purchaseid, c.category,
         extract(year  from p.fulldate) as year,
         extract(month from p.fulldate) as month
    from purchases_2020 p
         join categories c on c.purchase_id = p.purchaseid
  union all
  select p.fulldate, p.purchaseid, c.category,
         extract(year  from p.fulldate) as year,
         extract(month from p.fulldate) as month
    from purchases_2019 p
         join categories c on c.purchase_id = p.purchaseid
), kpis as (
  select year, month, 
         count(purchaseid) 
           filter (where category in ('whole milk', 'yogurt', 'domestic eggs'))
             as dairy_sales,
         count(purchaseid) * 1.0 as total_sales
    from all_data
)
select ty.month, ty.dairy_sales as total_sales,
       100.0 * ty.dairy_sales / ty.total_sales as market_share,
       100.0 * (   (ty.dairy_sales / ty.total_sales) 
                 - (ly.dairy_sales / ly.total_sales)) as year_change
  from kpis ty
       join kpis ly
         on (ly.year, ly.month) = (ty.year - 1, ty.month);

  • Related