Home > Back-end >  UNION 2 tablesand remove duplicates based on some columns
UNION 2 tablesand remove duplicates based on some columns

Time:04-05

I'm writing SQL query in Python and I'm trying to UNION 2 tables and remove duplicates(duplicates are considered rows with same values in fields "id" and "product_id"), I found answers that NOT EXISTS is helpful, but I couldn't make it work properly and I have no idea where is the problem. This is my SQL query:

        SELECT cc."id", cc."default_name", cc."slug", cpc."product_id", ce."value" 
        FROM 
            "catalog_category" AS cc, 
            "catalog_product_categories" AS cpc, 
            "catalog_categoryentitydetailvaluevarchar" AS ce
        WHERE 
            ( 
                cc."id" = cpc."category_id" 
                AND cpc."product_id" IN {objects_id} 
                AND cc."id" = ce."parent_object_id" 
                AND cpc."category_id" = ce."parent_object_id" 
                AND ce."shop" = \'shop_name\' 
            ) 
        UNION ALL 
        SELECT cc2."id", cc2."default_name", cc2."slug", cpc2."product_id", ce2."value" 
        FROM 
            "catalog_category" AS cc2, 
            "catalog_product_categories" AS cpc2, 
            "catalog_categoryentitydetailvaluevarchar" AS ce2 
        WHERE 
            ( 
                cc2."id" = cpc2."category_id" 
                AND cpc2."product_id" IN {objects_id} 
                AND cc2."id" = ce2."parent_object_id" 
                AND cpc2."category_id" = ce2."parent_object_id" 
                AND ce2."shop" = \'default\' 
                AND NOT EXISTS(
                                SELECT 1 
                                FROM 
                                    "catalog_category" AS cc, 
                                    "catalog_product_categories" AS cpc, 
                                    "catalog_categoryentitydetailvaluevarchar" AS ce 
                                WHERE 
                                    cc2."id"=cc."id" 
                                    AND cpc2."product_id"=cpc."product_id" 
                                )
            ) 

But result is that with NOT EXISTS statement I receive only values from first SELECT and without NOT EXISTS i receive all values from both tables with all duplicates.

CodePudding user response:

U can modify the query as below.

;With cte1 as (SELECT cc."id", cc."default_name", cc."slug", cpc."product_id", ce."value" FROM "catalog_category" AS cc, "catalog_product_categories" AS cpc, "catalog_categoryentitydetailvaluevarchar" AS ce WHERE ( cc."id" = cpc."category_id" AND cpc."product_id" IN {objects_id} AND cc."id" = ce."parent_object_id" AND cpc."category_id" = ce."parent_object_id" AND ce."shop" = 'shop_name' )) Select * from cte1 UNION SELECT cc2."id", cc2."default_name", cc2."slug", cpc2."product_id", ce2."value" FROM "catalog_category" AS cc2, "catalog_product_categories" AS cpc2, "catalog_categoryentitydetailvaluevarchar" AS ce2 WHERE ( cc2."id" = cpc2."category_id" AND cpc2."product_id" IN {objects_id} AND cc2."id" = ce2."parent_object_id" AND cpc2."category_id" = ce2."parent_object_id" AND ce2."shop" = 'default' AND NOT EXISTS( SELECT 1 FROM Cte1 WHERE cc2."id"=cte1."id" AND cpc2."product_id"=cte1."product_id" )

  • Related