How to Optimize a Query in an Oracle SQL Database


This query is taking 5 hours to extract 1M rows out of 120M.

select * from cc.ALC_TR_LS
where INV_TRN_ID > 0
order by INV_TRN_ID asc
fetch first 1000000 rows only;

Note: INV_TRN_ID is an identity column. I need to optimize this query. Please suggest.

CodePudding user response:

The first thing to do when you observe performance issues is to check the execution plan.
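If you have not captured one before, a minimal sketch for the query from the question (assuming Oracle 12c or later for the row-limiting clause) is:

explain plan for
select * from cc.ALC_TR_LS
where INV_TRN_ID > 0
order by INV_TRN_ID
fetch first 1000000 rows only;

-- display the plan just explained
select * from table(dbms_xplan.display);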

I'll simulate your case with this sample data

create table large_tab as
select rownum id, 
rpad('x',2000,'y') pad
from dual connect by level <= 20000;

The execution plan for the query (a simplified version of yours)

select * from large_tab
where id > 0
order by id
fetch first 10000 rows only;

is as follows

----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           | 10000 |     9M|       | 10223   (1)| 00:00:01 |
|*  1 |  VIEW                    |           | 10000 |     9M|       | 10223   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|           | 20000 |    38M|    39M| 10223   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL     | LARGE_TAB | 20000 |    38M|       |  1842   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10000)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "LARGE_TAB"."ID")<=10000)
   3 - filter("ID">0)

The critical point is operation Id 2, the WINDOW SORT PUSHED RANK, where you need to sort 1M rows (10K in my case).

Note that you sort whole rows, not just IDs, so you need a lot of temp space, as shown in the TempSpc column of the execution plan.

This is where you spend the elapsed time.
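If you want to confirm the temp spill while the statement is running, one possible check (assuming you have SELECT privileges on the V$ views) is:

-- temp segments currently allocated by your own session
select su.tablespace, su.segtype, su.blocks
from v$tempseg_usage su
join v$session s on s.saddr = su.session_addr
where s.sid = sys_context('userenv', 'sid');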

Alternative

If the table is rather static, you may use a two-step approach.

First get the highest ID of your first 1M rows, then get all data up to this ID.

Get the Upper Bound ID

select id from large_tab
where id > 0
order by id
offset 9999 rows fetch first 1 rows only;

The plan is similar, but since you sort only the IDs, which are compact, the temp space is gone:

--------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           | 20000 |   761K|  1844   (1)| 00:00:01 |
|*  1 |  VIEW                    |           | 20000 |   761K|  1844   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|           | 20000 |    97K|  1844   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL     | LARGE_TAB | 20000 |    97K|  1842   (1)| 00:00:01 |
--------------------------------------------------------------------------------------

Get Data

Knowing the upper bound ID, your query can be as simple as

select * from large_tab
where id > 0 and id <= 10000   ---<<< your upper bound here

Note that you do not need the ORDER BY at all to get the requested set of rows, so this will be a very efficient full table scan.
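If you prefer a single statement, the same idea can be sketched with a scalar subquery that computes the upper bound; only the compact IDs are sorted inside the subquery (shown against the sample table):

select *
from large_tab
where id > 0
  and id <= (select id from large_tab
             where id > 0
             order by id
             offset 9999 rows fetch first 1 rows only);

The explicit two-step variant remains preferable when you want to inspect or reuse the upper bound, e.g. for batched extraction.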

Two final remarks

  • To get 1M rows from a table, do not try to somehow force an index into the access path. If you want to optimize further, you need a partitioning scheme based on the ID (see the sketch after this list).

  • If you do not need all columns, do not use SELECT *, because, as you saw in the plan, you would be sorting this unneeded data as well.
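To illustrate the first remark, a minimal sketch of an interval-partitioned copy of the sample table (the name large_tab_part and the interval of 5000 are choices for this demo, not from the original post):

-- range/interval partitioning on ID: a predicate like id <= :max_id
-- lets the optimizer prune every partition above the bound
create table large_tab_part
partition by range (id)
interval (5000)
(partition p0 values less than (5000))
as
select rownum id,
       rpad('x',2000,'y') pad
from dual connect by level <= 20000;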

CodePudding user response:

Fetching 1,000,000 records, and with * (all columns) at that, will obviously take time. Suggested steps:

  1. Retrieve only the required columns (see the sketch after this list).
  2. Partition the table; it will be much faster. For more details about partitions, visit https://docs.oracle.com/cd/B19306_01/server.102/b14220/partconc.htm#:~:text=Each row in a partitioned,use of the partition key
  3. If the front end displays the data in a grid that already sorts ascending, the back end can skip the ORDER BY, which shortens execution time.
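As a sketch of point 1 against the question's table (col_a and col_b are hypothetical placeholders; substitute the columns you actually need):

select INV_TRN_ID, col_a, col_b   -- col_a, col_b are hypothetical placeholders
from cc.ALC_TR_LS
where INV_TRN_ID > 0
order by INV_TRN_ID
fetch first 1000000 rows only;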