Home > Software engineering >  SQL query that returns wrong results
SQL query that returns wrong results

Time:07-26

I have two long SQL queries and I want to merge them into a single one. The first one is:

SELECT 
        c.customer_id, 
        c.customer_lastname, 
        c.customer_firstname, 
        c.customer_email, 
        c.customer_group_id, 
        country.country_name AS customer_country, 
        customer_county.county_name AS customer_county,
        customer_city.city_name AS customer_city, 
        ci.customer_street_address, 
        ci.customer_phone, 
        c.num_orders, 
        cc.company_name, 
        cc.company_vat_number, 
        cc.J, 
        country2.country_name AS company_country, 
        company_county.county_name, 
        company_city.city_name, 
        cc.company_street, 
        cc.company_zip, 
        cc.company_phone, 
        cc.company_fax, 
        cc.company_bankname, 
        cc.company_iban, 
        c.customer_external_ref 
    FROM 
        customer AS c 
            USING (customer_id) LEFT JOIN customer_address AS ci 
            USING (customer_id) LEFT JOIN customer_company AS cc 
            USING (customer_id) LEFT JOIN country_i18n AS country 
                ON ci.country_code=country.country_code 
                LEFT JOIN country_i18n AS country2 
                    ON cc.country_code = country2.country_code 
                LEFT JOIN county AS customer_county 
                    ON customer_county.county_id = ci.county_id 
                LEFT JOIN city AS customer_city 
                    ON customer_city.city_id = ci.city_id 
                LEFT JOIN county AS company_county 
                    ON company_county.county_id = cc.county_id 
                LEFT JOIN city AS company_city 
                    ON company_city.city_id = cc.city_id 
    WHERE 
        1 = 1 
    GROUP BY 
        customer_id 
    ORDER BY 
        customer_id DESC 

And the second one:

SELECT 
        customer_id, 
        customer_firstname, 
        customer_lastname, 
        customer_email, 
        MIN(order_datetime) as first_order_date, 
        MAX(order_datetime) as last_order_date, 
        DATEDIFF(NOW(),  MAX(order_datetime)) as days_since_last_order, 
        COUNT(DISTINCT o.order_id) as orders_count, 
        SUM(IF(product_id > 0, IF(product_quantity > 0, product_quantity, 0), 0)) AS sold_products_count, 
        SUM(IF(product_id > 0, IF(product_quantity < 0, -1 * product_quantity, 0), 0)) AS returned_products_count, 
        ROUND(100 * SUM(IF(product_id > 0, IF(product_quantity < 0, -1 * product_quantity, 0), 0)) 
            / SUM(IF(product_id > 0, IF(product_quantity > 0, product_quantity, 0), 0))) as returned_products_percent, 
        SUM(order_total) as orders_total 
    FROM 
        `order_products` op 
            INNER JOIN `order` o 
                ON (op.order_id = o.order_id) 
    WHERE 
        shop_module_class IS NULL 
    GROUP by 
        o.customer_id DESC 

And the final query looks something like this:

SELECT 
        c.customer_id, 
        c.customer_lastname, 
        c.customer_firstname, 
        c.customer_email, 
        c.customer_group_id, 
        MIN(o.order_datetime) as first_order_date, 
        MAX(o.order_datetime) as last_order_date, 
        DATEDIFF(NOW(), MAX(o.order_datetime)) as days_since_last_order, 
        COUNT(DISTINCT o.order_id) as orders_count, 
        SUM(IF(op.product_id > 0, IF(op.product_quantity > 0, op.product_quantity, 0), 0)) AS sold_products_count, 
        SUM(IF(op.product_id > 0, IF(op.product_quantity < 0, -1 * op.product_quantity, 0), 0)) AS returned_products_count, 
        ROUND(100 * SUM(IF(op.product_id > 0, IF(op.product_quantity < 0, -1 * op.product_quantity, 0), 0)) 
            / SUM(IF(op.product_id > 0, IF(op.product_quantity > 0, op.product_quantity, 0), 0))) as returned_products_percent, 
        SUM(o.order_total) as orders_total, 
        country.country_name AS customer_country, 
        customer_county.county_name AS customer_county, 
        customer_city.city_name AS customer_city, 
        ci.customer_street_address, 
        ci.customer_phone, 
        c.num_orders, 
        cc.company_name, 
        cc.company_vat_number, 
        cc.J, 
        country2.country_name AS company_country, 
        company_county.county_name, 
        company_city.city_name, 
        cc.company_street, 
        cc.company_zip, 
        cc.company_phone, 
        cc.company_fax, 
        cc.company_bankname, 
        cc.company_iban, 
        c.customer_external_ref     
    FROM 
        `order_products` AS op 
            LEFT JOIN `order` AS o USING (order_id) 
            LEFT JOIN customer AS c USING (customer_id) 
            LEFT JOIN customer_address AS ci USING (customer_id) 
            LEFT JOIN customer_company AS cc USING (customer_id) 
            LEFT JOIN country_i18n AS country 
                ON ci.country_code = country.country_code 
            LEFT JOIN country_i18n AS country2 
                ON cc.country_code = country2.country_code 
            LEFT JOIN county AS customer_county 
                ON customer_county.county_id = ci.county_id 
            LEFT JOIN city AS customer_city 
                ON customer_city.city_id = ci.city_id 
            LEFT JOIN county AS company_county 
                ON company_county.county_id = cc.county_id 
            LEFT JOIN city AS company_city 
                ON company_city.city_id = cc.city_id 
    WHERE 
        1 = 1 
    GROUP BY 
        customer_id 
    ORDER BY 
        customer_id DESC

But the issue is that the final query returns bad results, for example: after I run the second query I find out that a client with id = 1 has the order_total = 100 (this info is 100% correct because I checked the db). But after I run the third query, I find out that the client with id = 1 has the order_total = 600, when the accurate total should be 100. I don't know what I'm missing.

CodePudding user response:

In your join condition , one row is joining with multiple rows of other table and because of which you are getting duplicate entries.

CodePudding user response:

Without being able to run the query it is hard to be sure exactly.

Still my best guesstimate is that your underlying query holding sums based on product ids in order to get such as sold_product_counts, cause for your orders_total to be on each invoice line product of the order.

Try removoing the group by and commenting out your aggregate methods (min, max, count, sum) to look at the net set and just put an order on something like customer and order to reveal if that is why.

  • Related