Home > database >  SQL Oracle Like-for-like comparison
SQL Oracle Like-for-like comparison

Time:10-26

I have the following dataset:

Table1 enter image description here

2021 is my base year, and I want to know which of these products that started 2021 also occurred in 2022. So only if a product has a Contract_start in 2021, I want to pull this same product with contract_start in 2022 as well.

My output should therefore look like this enter image description here

I tried below query. This approach fails to include all four instances of product C, instead it only pulls one product C for 2021 and one for 2022. How to fix this query so that all products starting in 2021 are included in the right number of rows, without causing duplication for the 2022 products?

with values_2021 as
(       select distinct
        CUSTOMER_NUMBER||PRODUCT as productcust
        
        from table1
        
        where substr(CONTRACT_START,1,4) = '2021'
 )

SELECT distinct
        CUSTOMER_NUMBER,
        TOTAL_DOLLARS,        
        PRODUCT,
        CONTRACT_START,
        CONTRACT_END         
       
  FROM  table1
  join values_2021 on (table1.CUSTOMER_NUMBER||table1.PRODUCT) = values_2021.productcust
  
  where CONTRACT_START >= ('20210101')

CodePudding user response:

Not sure how performant this would be with your data, but hopefully it points you in the right direction.

SELECT b.*
FROM   (SELECT DISTINCT customer_number,
                        product
        FROM   table1
        WHERE  Substr(contract_start, 1, 4) = '2021') a
       inner join table1 b
               ON a.customer_number = b.customer_number
                  AND a.product = b.product
                  AND b.contract_start >= ( '20210101' ) 

CodePudding user response:

One option is to extend what you are doing using common table expressions. Create another CTE for 2022. Then, join both back to the main table. As jarlh mentioned, join both columns instead of using string concatenation. Switching the SUBSTR to LIKE could allow an index to be used if one exists on contract_start.

WITH values_2021 AS (
    SELECT customer_number
          ,product
      FROM table1
     WHERE contract_start LIKE '2021%'
     GROUP BY customer_number
             ,product
),values_2022 AS (
    SELECT customer_number
          ,product
      FROM table1
     WHERE contract_start LIKE '2022%'
     GROUP BY customer_number
             ,product
)
SELECT t1.customer_number
      ,t1.total_dollars
      ,t1.product
      ,t1.contract_start
      ,t1.contract_end
  FROM table1 t1
  JOIN values_2021 v2021 ON ( t1.customer_number = v2021.customer_number
   AND t1.product = v2021.product )
  JOIN values_2022 v2022 ON ( t1.customer_number = v2022.customer_number
   AND t1.product = v2022.product );
  • Related