I am trying to rewrite a query by removing union all for improving the performance. But not getting clear picture to remove this. Any suggestion would do good. Thanks in Advance.
select a.*,"no" as indicator,a.num from a where a=b
union all
select a.*,"yes" as indicator,a.num from a where a=c
union all
select a.*,"no" as indicator,"All" as num from a where a=b
union all
select a.*,"yes" as indicator,"All" as num from a where a=c ;
CodePudding user response:
First let me state that the best optimization would be if you re-think you approach and change the query to avoid the redundant output. - the 3th and 4th provide no new information that is not contained in the first two selects.
Anyway if you want to optimize - this is your approach:
The original query makes four time full table scan on your table - you want to do only one.
Sample Data
create table tab as
select 1 a, 2 b, 3 c, '1' num from dual union all
select 1 a, 1 b, 0 c, '2' num from dual union all
select 1 a, 0 b, 1 c, '3' num from dual union all
select 1 a, 0 b, 1 c, '4' num from dual union all
select 1 a, 1 b, 1 c, '5' num from dual;
The query leads to the following output
select a.a, a.b, a.c, 'no' as indicator,a.num from tab a where a=b
union all
select a.a, a.b, a.c, 'yes' as indicator,a.num from tab a where a=c
union all
select a.a, a.b, a.c, 'no' as indicator,'All' as num from tab a where a=b
union all
select a.a, a.b, a.c, 'yes' as indicator,'All' as num from tab a where a=c
order by 5,4,1,2,3;
A B C IND NUM
---------- ---------- ---------- --- ---
1 1 0 no 2
1 0 1 yes 3
1 0 1 yes 4
1 1 1 no 5
1 1 1 yes 5
1 1 0 no All
1 1 1 no All
1 0 1 yes All
1 0 1 yes All
1 1 1 yes All
New Query
First scan the table and define the column a_eq_b
and a_eq_c
in case that the conditions a=b
resp. a=c
holds.
Than join to the yn
table containing two rows yes
and no
with a join condition that either suppress the row if no condition is met or result in one or two row join simulation your original union all
.
Finaly introduce your redundancy duplication all rows with the additional All
record joining to the table All
with two rows.
This query makes only one FTS and provides identical result.
with yn as (
select 'yes' indicator from dual union all
select 'no' from dual),
num as (
select 'All' num from dual union all
select null from dual),
dt as
(select
A, B, C, NUM,
case when a=b then 'Y' end a_eq_b,
case when a=c then 'Y' end a_eq_c
from tab)
select dt.A, dt.B, dt.C,
yn.indicator,
nvl(num.num,dt.num) num
from dt
join yn on dt.A_EQ_C = 'Y' and yn.indicator = 'yes' or dt.A_EQ_B = 'Y' and yn.indicator = 'no'
cross join num
order by 5,4,1,2,3;
CodePudding user response:
This seems like a very odd query indeed but, assuming your example query is representative of what you're trying to achieve, you can rewrite it like so:
WITH a AS (SELECT 1 ID, 1 a, 1 b, 1 c, 10 num FROM dual UNION ALL
SELECT 1 ID, 1 a, 3 b, 1 c, 20 num FROM dual UNION ALL
SELECT 2 ID, 2 a, 1 b, 1 c, 30 num FROM dual UNION ALL
SELECT 2 ID, 2 a, 2 b, 1 c, 40 num FROM dual UNION ALL
SELECT 3 ID, 2 a, 1 b, 4 c, 50 num FROM dual), -- you wouldn't need this subquery; you already have an "a" table
dummy AS (SELECT 1 ID FROM dual UNION ALL
SELECT 2 ID FROM dual UNION ALL
SELECT 3 ID FROM dual UNION ALL
SELECT 4 ID FROM dual)
SELECT a.*,
CASE WHEN d.id IN (1, 2) THEN 'no'
WHEN d.id IN (3, 4) THEN 'yes'
END indicator,
CASE WHEN d.id IN (1, 3) THEN to_char(a.num)
WHEN d.id IN (2, 4) THEN 'All'
END NUM,
d.id AS "D.ID"
FROM a
INNER JOIN dummy d ON (a.a = a.b AND d.id IN (1, 2))
OR (a.a = a.c AND d.id IN (3, 4))-- each row in the a dataset matching the filter conditions will be returned twice or four times by this join
WHERE a.a = a.b
OR a.a = a.c
ORDER BY a.id, a.a, a.b, a.c, d.id;
ID A B C NUM INDICATOR NUM D.ID
---------- ---------- ---------- ---------- ---------- --------- ---- ----------
1 1 1 1 10 no 10 1
1 1 1 1 10 no ALL 2
1 1 1 1 10 yes 10 3
1 1 1 1 10 yes ALL 4
1 1 3 1 20 yes 20 3
1 1 3 1 20 yes ALL 4
2 2 2 1 40 no 40 1
2 2 2 1 40 no ALL 2
This basically works by doing a partial cross join of your table's rows with a dummy subquery that has 4 rows (you have two conditions that you want to duplicate rows for, so 2x2 = 4, because presumably you could have a row that matches both conditions).
Once the join has taken place, you can see that the rows that match the a=b
condition are duplicated with a d.id
of 1 and 2, and similarly the a=c
matching rows are duplicated with a d.id
of 3 and 4. Thus you can use the d.id
column in case expressions to decide what output to display.
P.s. I've assumed your num
column is of NUMBER
datatype, hence the to_char()
in the second case expression.