I have query something like
WITH
str_table as (
SELECT stringtext, stringnumberid
FROM STRING_TABLE
WHERE LANGID IN (23,62)
),
data as (
select *
from employee emp
left outer join str_table st on emp.nameid = st.stringnumberid
)
select * from data
I know With clause will work in this manner
Step 1 : The SQL Query within the with clause is executed at first step.
Step 2 : The output of the SQL query is stored into temporary relation of with clause.
Step 3 : The Main query is executed with temporary relation produced at the last stage.
Now I want to ask whether the indexes created on the actual STRING_TABLE
are going to help in temporary str_table
produce by the With clause? I want to ask whether the indexes also have impact on str_table or not?
CodePudding user response:
Oracle will not process CTE one by one. It will analyze the SQL as a whole. Your SQL is most likely the same as following in the eye of Oracle optimizer
select emp.*
from employee emp left outer join STRING_TABLE st
on emp.nameid = st.stringnumberid
where st.LANGID IN (23,62);
Oracle can use index on STRING_TABLE
. Whether it will depends on the table statistics. For example, if the table has few rows (say a few hundred), Oracle will likely not use index.
CodePudding user response:
It depends.
First of all, with
clause is not a temporary table. As documentation says:
Oracle Database optimizes the query by treating the query name as either an inline view or as a temporary table.
Optimizer decides to materialize with
subquery if either you forse it to do so by using /* materialize*/
hint inside the subquery or you reuse this with
subquery more than once.
In the example below Oracle uses with
clause as inline view and merges it within the main query:
explain plan for with a as ( select s.textid, s.textvalue, a.id, a.other_column from string_table s join another_tab a on s.textid = a.textid where langid in (1) ) select * from big_table b join a a_name on b.name_textid = a_name.textid and b.job_textid = a_name.id
| PLAN_TABLE_OUTPUT | | :----------------------------------------------------------------------------------- | | Plan hash value: 1854049435 | | | | ------------------------------------------------------------------------------------ | | | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | | | ------------------------------------------------------------------------------------ | | | 0 | SELECT STATEMENT | | 1 | 1147 | 74 (0)| 00:00:01 | | | |* 1 | HASH JOIN | | 1 | 1147 | 74 (0)| 00:00:01 | | | | 2 | TABLE ACCESS FULL | ANOTHER_TAB | 39 | 3042 | 3 (0)| 00:00:01 | | | |* 3 | HASH JOIN | | 31 | 33139 | 71 (0)| 00:00:01 | | | | 4 | TABLE ACCESS FULL| BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | | | |* 5 | TABLE ACCESS FULL| STRING_TABLE | 1143 | 589K| 68 (0)| 00:00:01 | | | ------------------------------------------------------------------------------------ |
But depending on the statistics and hints it may evaluate subquery first and then add it to the main query:
explain plan for with a as ( select s.textid, s.textvalue, a.id, a.other_column from string_table s join another_tab a on s.textid = a.textid where langid in (1) ) select /* NO_MERGE(a_name)*/ * from big_table b join a a_name on b.name_textid = a_name.textid and b.job_textid = a_name.id
| PLAN_TABLE_OUTPUT | | :------------------------------------------------------------------------------------ | | Plan hash value: 4105667421 | | | | ------------------------------------------------------------------------------------- | | | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | | | ------------------------------------------------------------------------------------- | | | 0 | SELECT STATEMENT | | 101 | 110K| 74 (0)| 00:00:01 | | | |* 1 | HASH JOIN | | 101 | 110K| 74 (0)| 00:00:01 | | | | 2 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | | | | 3 | VIEW | | 64 | 37120 | 71 (0)| 00:00:01 | | | |* 4 | HASH JOIN | | 64 | 38784 | 71 (0)| 00:00:01 | | | | 5 | TABLE ACCESS FULL| ANOTHER_TAB | 39 | 3042 | 3 (0)| 00:00:01 | | | |* 6 | TABLE ACCESS FULL| STRING_TABLE | 1143 | 589K| 68 (0)| 00:00:01 | | | ------------------------------------------------------------------------------------- |
When you use with
subquery twice, optimizer decides to materialize it:
explain plan for with a as ( select s.textid, s.textvalue from string_table s where langid in (1) ) select * from big_table b join a a_name on b.name_textid = a_name.textid join a a_job on b.job_textid = a_job.textid
| PLAN_TABLE_OUTPUT | | :--------------------------------------------------------------------------------------------------------------------- | | Plan hash value: 1371454296 | | | | ---------------------------------------------------------------------------------------------------------------------- | | | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | | | ---------------------------------------------------------------------------------------------------------------------- | | | 0 | SELECT STATEMENT | | 63 | 98973 | 67 (0)| 00:00:01 | | | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D7224_469C01 | | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED | STRING_TABLE | 999 | 515K| 22 (0)| 00:00:01 | | | |* 4 | INDEX RANGE SCAN | IX | 999 | | 4 (0)| 00:00:01 | | | |* 5 | HASH JOIN | | 63 | 98973 | 45 (0)| 00:00:01 | | | |* 6 | HASH JOIN | | 35 | 36960 | 24 (0)| 00:00:01 | | | | 7 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | | | | 8 | VIEW | | 999 | 502K| 21 (0)| 00:00:01 | | | | 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D7224_469C01 | 999 | 502K| 21 (0)| 00:00:01 | | | | 10 | VIEW | | 999 | 502K| 21 (0)| 00:00:01 | | | | 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D7224_469C01 | 999 | 502K| 21 (0)| 00:00:01 | | | ---------------------------------------------------------------------------------------------------------------------- |
So when there are some indexes on tables inside with
subquery they may be used in all above cases: before materialization, when subquery is not merged and when subquery is merged and some idexes provide better query plan on merged subquery (even when those indexes are not used when you execute subquery alone).
What about idexes: if they provide high selectivity (i.e. number of rows retrieved by index is small compared to the overall number of rows), then Oracle will consider to use it. Note, that index access has two steps: read index blocks and then read table blocks that contain rowid
s found by index. If table size is not much bigger than index size, then Oracle may use table scan instead of index scan even for quite selective predicate (because of doubled IO).
In the below example I've used "small" texts (100 chars) and big_table
table of 20 rows and this index for text table:
create index ix
on string_table(langid, textid)
Optimizer decides to use index range scan
and read only blocks of the first level (first column of the index):
explain plan for with a as ( select s.textid, s.textvalue from string_table s where langid in (1) ) select * from big_table b join a a_name on b.name_textid = a_name.textid
| PLAN_TABLE_OUTPUT | | :---------------------------------------------------------------------------------------------------- | | Plan hash value: 1660330381 | | | | ----------------------------------------------------------------------------------------------------- | | | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | | | ----------------------------------------------------------------------------------------------------- | | | 0 | SELECT STATEMENT | | 29 | 31001 | 26 (0)| 00:00:01 | | | |* 1 | HASH JOIN | | 29 | 31001 | 26 (0)| 00:00:01 | | | | 2 | TABLE ACCESS FULL | BIG_TABLE | 19 | 10279 | 3 (0)| 00:00:01 | | | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| STRING_TABLE | 999 | 515K| 23 (0)| 00:00:01 | | | |* 4 | INDEX RANGE SCAN | IX | 999 | | 4 (0)| 00:00:01 | | | ----------------------------------------------------------------------------------------------------- | | | | Predicate Information (identified by operation id): | | --------------------------------------------------- | | | | 1 - access("B"."NAME_TEXTID"="S"."TEXTID") | | 4 - access("LANGID"=1) | |
But when we reduce the number of rows in big_table
, it uses both the columns for index scan:
delete from big_table where id > 4 explain plan for with a as ( select s.textid, s.textvalue from string_table s where langid in (1) ) select * from big_table b join a a_name on b.name_textid = a_name.textid
| PLAN_TABLE_OUTPUT | | :-------------------------------------------------------------------------------------------- | | Plan hash value: 1766926914 | | | | --------------------------------------------------------------------------------------------- | | | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | | | --------------------------------------------------------------------------------------------- | | | 0 | SELECT STATEMENT | | 6 | 18216 | 11 (0)| 00:00:01 | | | | 1 | NESTED LOOPS | | 6 | 18216 | 11 (0)| 00:00:01 | | | | 2 | NESTED LOOPS | | 6 | 18216 | 11 (0)| 00:00:01 | | | | 3 | TABLE ACCESS FULL | BIG_TABLE | 4 | 4032 | 3 (0)| 00:00:01 | | | |* 4 | INDEX RANGE SCAN | IX | 1 | | 1 (0)| 00:00:01 | | | | 5 | TABLE ACCESS BY INDEX ROWID| STRING_TABLE | 2 | 4056 | 2 (0)| 00:00:01 | | | --------------------------------------------------------------------------------------------- | | | | Predicate Information (identified by operation id): | | --------------------------------------------------- | | | | 4 - access("LANGID"=1 AND "B"."NAME_TEXTID"="S"."TEXTID") | | |
You may check above code snippets in the db<>fiddle.