Is it possible to optimize this Oracle SQL query any further (EXISTS)?-CodePudding

Hi hoping to have help to optimize the query below, it is taking 4 hours to run when comparing two larger tables with 55k and 1.6M rows respectively.

I have 2 tables Oppty and Acc and a query to derive the "CF" field (fiddle https://dbfiddle.uk/sRT7-kvz)

Both tables here

And below have a query deriving where if there is NO match in Account_ID between the tables it will say 'No Match @ ACC_ID Level'

If there IS a match at the ACC_ID level then, it will look at Prod1 match (anywhere in that ACC_ID), if NO Prod1 match it will just say 'Match @ ACC_ID Level'
If there IS a match at the ACC_ID and Prod1 level then it will look at Prod2 match (anywhere in that ACC_ID) and if NO is Prod2 match it will say 'Match @ ACC_ID, Prod1 Levels', and if there IS a Prod2 match it will say 'Match @ ALL Levels'

SELECT  
  op."Acc_ID"
, op."Oppty_ID"
, op."Prod1op"
, op."Prod2op"
, CASE WHEN EXISTS (
            SELECT 1 
            FROM Acc ac 
            WHERE ac."Acc_ID" = op."Acc_ID") 
                  THEN 
                      CASE WHEN EXISTS (
                      SELECT 1 
                      FROM Acc ac 
                      WHERE ac."Acc_ID" = op."Acc_ID"
                      AND ac."Prod1acc" = op."Prod1op") 
                          THEN 
                              CASE WHEN EXISTS (
                              SELECT 1 
                              FROM Acc ac 
                              WHERE ac."Acc_ID" = op."Acc_ID"
                              AND ac."Prod1acc" = op."Prod1op"
                              AND ac."Prod2acc" = op."Prod2op") 
                                  THEN 'Match @ ALL Levels' 
                          ELSE 'Match @ ACC_ID, Prod1 Levels' END
                  ELSE 'Match @ ACC_ID Level' END
       ELSE 'No Match @ ACC_ID Level' END CF
FROM Oppty op
  ORDER BY op."Acc_ID", op."Prod1op"

Output table here

CodePudding user response：

You have three, count'em, three nested correlated subqueries. It's almost certain that your execution plan ends up running each one for each outer query.

Refactor those correlated subqueries to LEFT JOINs and you'll recover some performance.

SELECT  DISTINCT
  op."Acc_ID"
, op."Oppty_ID"
, op."Prod1op"
, op."Prod2op"
, CASE WHEN ac1."Acc_ID" IS NOT NULL AND  ac2."Acc_ID" IS NOT NULL AND ac3."Acc_ID" IS NOT NULL
       THEN  'Match @ ALL Levels'
       WHEN ac1."Acc_ID" IS NOT NULL AND  ac2."Acc_ID" IS NOT NULL 
       THEN  'Match @ ACC_ID, Prod1 Levels'
       WHEN ac1."Acc_ID" IS NOT NULL
       THEN 'Match @ ACC_ID Level'
       ELSE 'No Match @ ACC_ID Level' END CF       
FROM Oppty op
LEFT JOIN Acc ac1 ON op."Acc_ID" = ac1."Acc_ID"
LEFT JOIN Acc ac2 ON op."Acc_ID" = ac2."Acc_ID"
                   AND ac2."Prod1acc" = op."Prod1op"
LEFT JOIN Acc ac3   ON op."Acc_ID" = ac3."Acc_ID"
                   AND ac3."Prod1acc" = op."Prod1op"
                   AND ac3."Prod2acc" = op."Prod2op" 
  ORDER BY op."Acc_ID", op."Prod1op"

Fiddle.

CodePudding user response：

1 option is to rewrite it as lateral, so you would have just one lookup to Acc per row from Oppty:

DBFiddle: https://dbfiddle.uk/gP5yyeoS

SELECT  
  op."Acc_ID"
, op."Oppty_ID"
, op."Prod1op"
, op."Prod2op"
, L.*
FROM Oppty op
     cross apply (
        SELECT
           case max(
                   case 
                     when ac."Prod1acc" = op."Prod1op" AND ac."Prod2acc" = op."Prod2op" then 3
                     when ac."Prod1acc" = op."Prod1op"                                  then 2
                     else 1
                   end)
                when 3 then 'Match @ ALL Levels' 
                when 2 then 'Match @ ACC_ID, Prod1 Levels'
                when 1 then 'Match @ ACC_ID Level'
                else 'No Match @ ACC_ID Level'
           end as CF
        FROM Acc ac 
        WHERE ac."Acc_ID" = op."Acc_ID"
     ) L
  ORDER BY op."Acc_ID", op."Prod1op"

Another variant is to join all rows and then reduce them to only max suited rows:

DBFiddle 2: https://dbfiddle.uk/1iSSy2Hr

See latest one there.

SELECT  
    op."Acc_ID"
  , op."Oppty_ID"
  , op."Prod1op"
  , op."Prod2op"
  , case
      max(
        case
             when ac."Acc_ID" = op."Acc_ID"
              AND ac."Prod1acc" = op."Prod1op"
              AND ac."Prod2acc" = op."Prod2op"
             then 3
             when ac."Acc_ID" = op."Acc_ID"
              AND ac."Prod1acc" = op."Prod1op"
             then 2
             when ac."Acc_ID" = op."Acc_ID"
             then 1
             else 0
         end
        )
      when 3 then 'Match @ ALL Levels' 
      when 2 then 'Match @ ACC_ID, Prod1 Levels'
      when 1 then 'Match @ ACC_ID Level'
      else 'No Match @ ACC_ID Level'
    end  as SF
FROM Oppty op
     left join Acc ac 
          on ac."Acc_ID" = op."Acc_ID"
group by 
    op."Acc_ID"
  , op."Oppty_ID"
  , op."Prod1op"
  , op."Prod2op"
  , op.rowid
ORDER BY op."Acc_ID", op."Prod1op"

In this case you can get simple Hash Join Right Outer with a very fast group-by.

You can also aggregate Acc to get just one row with all tuples of Prod1acc/Prod2acc, it should be even faster, but probably these 2 variants should be enough.

Just in case 3rd variant using collections (collections are not fast, usually it's better to aggregate to varchar2, but such approaches have length limits, so I wouldn't advice it here, since I don't know your details about your data):

DBFiddle 3: https://dbfiddle.uk/yoZfReji

with 
acc_agg as (
   select
      ac."Acc_ID"
     ,cast(collect(sys.ku$_ObjNumNam(1,ac."Prod1acc")) as sys.ku$_ObjNumNamSet) tuples_1
     ,cast(collect(sys.ku$_parsed_item(ac."Prod1acc",ac."Prod2acc",0)) as sys.ku$_parsed_items) tuples_2
   from acc ac
   group by ac."Acc_ID"
)
SELECT  
  op."Acc_ID"
, op."Oppty_ID"
, op."Prod1op"
, op."Prod2op"
, case 
     when aa."Acc_ID" = op."Acc_ID" 
       then 
         case 
            when sys.ku$_parsed_item(op."Prod1op",op."Prod2op",0) member of aa.tuples_2 
              then 'Match @ ALL Levels' 
            when sys.ku$_ObjNumNam(1,op."Prod1op") member of aa.tuples_1
              then 'Match @ ACC_ID, Prod1 Levels'
            else 'Match @ ACC_ID Level'
         end
     else 'No Match @ ACC_ID Level'
  end as SF
FROM Oppty op
     left join acc_agg aa
          on aa."Acc_ID" = op."Acc_ID"
ORDER BY op."Acc_ID", op."Prod1op";