Home > front end >  Union All alternative
Union All alternative

Time:03-11

I am trying to rewrite a query by removing union all for improving the performance. But not getting clear picture to remove this. Any suggestion would do good. Thanks in Advance.

select a.*,"no" as indicator,a.num from a where a=b 
union all 
select a.*,"yes" as indicator,a.num  from a where a=c 
union all
select a.*,"no" as indicator,"All" as num from a where a=b
union all
select a.*,"yes" as indicator,"All" as num from a where a=c  ;

CodePudding user response:

First let me state that the best optimization would be if you re-think you approach and change the query to avoid the redundant output. - the 3th and 4th provide no new information that is not contained in the first two selects.

Anyway if you want to optimize - this is your approach:

The original query makes four time full table scan on your table - you want to do only one.

Sample Data

create table tab as
select 1 a, 2 b, 3 c, '1' num from dual union all
select 1 a, 1 b, 0 c, '2' num from dual union all
select 1 a, 0 b, 1 c, '3' num from dual union all
select 1 a, 0 b, 1 c, '4' num from dual union all
select 1 a, 1 b, 1 c, '5' num from dual;

The query leads to the following output

select a.a, a.b, a.c, 'no' as indicator,a.num from tab a where a=b 
union all 
select a.a, a.b, a.c, 'yes' as indicator,a.num  from tab a where a=c 
union all
select a.a, a.b, a.c, 'no' as indicator,'All' as num from tab a where a=b
union all
select a.a, a.b, a.c, 'yes' as indicator,'All' as num from tab a where a=c
order by 5,4,1,2,3;



         A          B          C IND NUM
---------- ---------- ---------- --- ---
         1          1          0 no  2  
         1          0          1 yes 3  
         1          0          1 yes 4  
         1          1          1 no  5  
         1          1          1 yes 5  
         1          1          0 no  All
         1          1          1 no  All
         1          0          1 yes All
         1          0          1 yes All
         1          1          1 yes All

New Query

First scan the table and define the column a_eq_b and a_eq_c in case that the conditions a=bresp. a=c holds.

Than join to the yn table containing two rows yesand no with a join condition that either suppress the row if no condition is met or result in one or two row join simulation your original union all.

Finaly introduce your redundancy duplication all rows with the additional All record joining to the table Allwith two rows.

This query makes only one FTS and provides identical result.

with yn as (
select 'yes' indicator from dual union all
select 'no'  from dual),
num as (
select  'All' num from dual union all
select   null from dual),
dt as
(select 
  A, B, C, NUM,
  case when a=b then 'Y' end a_eq_b,
  case when a=c then 'Y' end a_eq_c
from tab)
select dt.A, dt.B, dt.C,
yn.indicator,
nvl(num.num,dt.num) num
from dt
join yn on dt.A_EQ_C = 'Y' and yn.indicator = 'yes'  or dt.A_EQ_B = 'Y' and yn.indicator = 'no'
cross join num
order by 5,4,1,2,3;

CodePudding user response:

This seems like a very odd query indeed but, assuming your example query is representative of what you're trying to achieve, you can rewrite it like so:

WITH  a AS (SELECT 1 ID, 1 a, 1 b, 1 c, 10 num FROM dual UNION ALL
            SELECT 1 ID, 1 a, 3 b, 1 c, 20 num FROM dual UNION ALL
            SELECT 2 ID, 2 a, 1 b, 1 c, 30 num FROM dual UNION ALL
            SELECT 2 ID, 2 a, 2 b, 1 c, 40 num FROM dual UNION ALL
            SELECT 3 ID, 2 a, 1 b, 4 c, 50 num FROM dual), -- you wouldn't need this subquery; you already have an "a" table
  dummy AS (SELECT 1 ID FROM dual UNION ALL
            SELECT 2 ID FROM dual UNION ALL
            SELECT 3 ID FROM dual UNION ALL
            SELECT 4 ID FROM dual)
SELECT a.*,
       CASE WHEN d.id IN (1, 2) THEN 'no'
            WHEN d.id IN (3, 4) THEN 'yes'
       END indicator,
       CASE WHEN d.id IN (1, 3) THEN to_char(a.num)
            WHEN d.id IN (2, 4) THEN 'All'
       END NUM,
       d.id AS "D.ID"
FROM   a
       INNER JOIN dummy d ON     (a.a = a.b AND d.id IN (1, 2))
                              OR (a.a = a.c AND d.id IN (3, 4))-- each row in the a dataset matching the filter conditions will be returned twice or four times by this join
WHERE  a.a = a.b
OR     a.a = a.c
ORDER BY a.id, a.a, a.b, a.c, d.id;

        ID          A          B          C        NUM INDICATOR NUM        D.ID
---------- ---------- ---------- ---------- ---------- --------- ---- ----------
         1          1          1          1         10 no        10            1
         1          1          1          1         10 no        ALL           2
         1          1          1          1         10 yes       10            3
         1          1          1          1         10 yes       ALL           4
         1          1          3          1         20 yes       20            3
         1          1          3          1         20 yes       ALL           4
         2          2          2          1         40 no        40            1
         2          2          2          1         40 no        ALL           2

This basically works by doing a partial cross join of your table's rows with a dummy subquery that has 4 rows (you have two conditions that you want to duplicate rows for, so 2x2 = 4, because presumably you could have a row that matches both conditions).

Once the join has taken place, you can see that the rows that match the a=b condition are duplicated with a d.id of 1 and 2, and similarly the a=c matching rows are duplicated with a d.id of 3 and 4. Thus you can use the d.id column in case expressions to decide what output to display.

P.s. I've assumed your num column is of NUMBER datatype, hence the to_char() in the second case expression.

  • Related