Home > Software engineering >  Group table results with information from another
Group table results with information from another

Time:12-22

I'm trying to perform a query that returns an aggregation of values from the same table with information from others through a foreign key, but I can't. In the example below, I wanted to return the total sales by state on 2020-01-01 and 2021-01-01, showing the name of the state.

Tables script:

CREATE TABLE IF NOT EXISTS estado (
    id SERIAL PRIMARY KEY,
    estado VARCHAR(100)
)

CREATE TABLE IF NOT EXISTS municipio (
    id SERIAL PRIMARY KEY,
    estado integer REFERENCES estado(id),
    municipio VARCHAR(100)
)

CREATE TABLE IF NOT EXISTS vendas (
    id SERIAL PRIMARY KEY,
    municipio integer REFERENCES municipio(id),
    valor numeric,
    data_venda date
)

INSERT INTO estado VALUES (1, 'PR');
INSERT INTO estado VALUES (2, 'SC');
INSERT INTO estado VALUES (3, 'RS');
INSERT INTO municipio VALUES (1, 1, 'Pelotas');
INSERT INTO municipio VALUES (2, 1, 'Caxias do Sul');
INSERT INTO municipio VALUES (3, 1, 'Porto Alegre');
INSERT INTO municipio VALUES (4, 2, 'Florianopolis');
INSERT INTO municipio VALUES (5, 2, 'Chapeco');
INSERT INTO municipio VALUES (6, 2, 'Itajai');
INSERT INTO municipio VALUES (7, 3, 'Curitiba');
INSERT INTO municipio VALUES (8, 3, 'Maringa');
INSERT INTO municipio VALUES (9, 3, 'Foz do Iguaçu');
INSERT INTO vendas VALUES (1, 6, 5, '2020-01-01');
INSERT INTO vendas VALUES (2, 5, 10, '2021-01-01');
INSERT INTO vendas VALUES (3, 5, 5, '2020-01-01');
INSERT INTO vendas VALUES (4, 4, 2, '2020-01-01');
INSERT INTO vendas VALUES (5, 3, 10, '2021-01-01');
INSERT INTO vendas VALUES (6, 3, 12, '2020-01-01');
INSERT INTO vendas VALUES (7, 3, 20, '2020-01-01');
INSERT INTO vendas VALUES (8, 2, 10, '2020-01-01');
INSERT INTO vendas VALUES (9, 1, 11, '2021-01-01');
INSERT INTO vendas VALUES (10, 9, 4, '2020-01-01');

My attempt (absurd values and the RS ones do not appear):

SELECT 
    e.estado, SUM(v.valor) as sum2021, SUM(v2.valor) as sum2020
FROM vendas v 
    CROSS JOIN vendas v2 
    INNER JOIN municipio m ON v.municipio = m.id
    INNER JOIN estado e ON m.estado = e.id
WHERE v.data_venda = '2021-01-01' 
    AND v2.data_venda = '2020-01-01' 
GROUP BY 1;

Translating some terms:

município = city 
estado = state 
vendas = sales
valor = value 
data_venda = date of sale

CodePudding user response:

You're cross joining vendas with itself (as v1 and v2), meaning that each row from it will be matched with each other row (i.e., a Cartesian product), which creates the unexpected results you're seeing.

The good news is that you don't need this join. You can use an aggregate function (sum in this case) on a subset of the rows from the query using the filter clause:

SELECT 
    e.estado, 
    SUM(v.valor) FILTER (WHERE data_venda = '2021-01-01') AS sum2021, 
    SUM(v.valor) FILTER (WHERE data_venda = '2020-01-01') AS sum2020
FROM vendas v 
    INNER JOIN municipio m ON v.municipio = m.id
    INNER JOIN estado e ON m.estado = e.id
GROUP BY 
    e.estado;

SQLFiddle demo

  • Related