UNION performance

Time:09-22

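I have 20 tables, each with a policy number column. The data differs from table to table and so do the sizes: the large tables may hold 500 to 600 million rows, the small ones tens of millions. How can I quickly collect the policy numbers from all of the tables into a single table with good performance?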

CodePudding user response:

A UNION across different tables? Performance is not a property of objects such as tables or views. It is a property of the application, that is, of the SQL. You can only tune once you know how the objects are being accessed.

CodePudding user response:

It is not simply a matter of UNION; that is unrealistic.

Meeting your requirement comes first; performance tuning comes second.

You need to state clearly what the source data looks like and what result you want.

CodePudding user response:

What I need is to put all the policy numbers from the 20 tables into one temporary table. I am worried about missing data, so every table has to be read.
The simple version of the statement is:

create table TMP
as
select policy
from (select policy from tab1
union select policy from tab2
.
.
union select policy from tab20
) tmp

I am worried about performance and hope there is a better solution.

CodePudding user response:

If you are only building a temporary table from a single column, that is not a problem.
But which performance do you mean? The performance of creating the temporary table, or of the queries that run against it afterward?

CodePudding user response:

1. The performance of the SQL itself; a statement like this puts pressure on the database.
2. The duplication rate across the 20 tables really is very high; it is likely that three of the tables already contain all the policy numbers. What I am mainly hoping for is a better way to do this.

CodePudding user response:

Quoting reply #5 by sbymdh2003:
1. The performance of the SQL itself; a statement like this puts pressure on the database.
2. The duplication rate across the 20 tables really is very high; it is likely that three of the tables already contain all the policy numbers. What I am mainly hoping for is a better way to do this.


What do you mean by "the SQL itself"?

CodePudding user response:

If you can be sure that policy numbers are not repeated across the tables, use UNION ALL rather than UNION,
because UNION has to sort internally to remove duplicates, and sorting 20 tables of around 500 million rows each together is very expensive.

 
create table TMP
as
select policy from tab1
union all select policy from tab2
.
.
union all select policy from tab20
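
The difference is easy to see on a toy query. A minimal sketch, assuming an Oracle session (DUAL is Oracle's built-in one-row table; the literal 'P001' is a made-up policy number):

```sql
-- UNION removes duplicates, so this returns a single row.
select 'P001' as policy from dual
union
select 'P001' from dual;

-- UNION ALL simply concatenates its inputs, so this returns two rows.
select 'P001' as policy from dual
union all
select 'P001' from dual;
```

Skipping the duplicate-removal step is where UNION ALL saves its time, so it is only safe when duplicates are impossible or are handled afterward.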


If you do need to remove duplicates, I suggest creating the TMP table first and putting a unique index on this column,
then inserting the data from each table in turn:
 
-- distinct: a source table may itself contain repeated policy numbers, which the unique index would reject
insert into TMP select distinct policy from tab1 where not exists (select 1 from TMP where TMP.policy = tab1.policy);
insert into TMP select distinct policy from tab2 where not exists (select 1 from TMP where TMP.policy = tab2.policy);
insert into TMP select distinct policy from tab3 where not exists (select 1 from TMP where TMP.policy = tab3.policy);
.
insert into TMP select distinct policy from tab20 where not exists (select 1 from TMP where TMP.policy = tab20.policy);
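
The setup step that the inserts above rely on might look like this. It is a sketch only: the column type VARCHAR2(30) and the index name tmp_policy_uk are assumptions, not taken from the thread.

```sql
-- Empty target table; the column type is assumed and should match the source tables.
create table TMP (policy varchar2(30));

-- Unique index: lets the NOT EXISTS probe use an index lookup
-- and rejects any duplicate policy number that slips through.
create unique index tmp_policy_uk on TMP (policy);
```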


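You can also try insert /*+ APPEND */ into ... (note that there must be no space between /* and +, otherwise Oracle treats the hint as an ordinary comment).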

CodePudding user response:

That means running NOT EXISTS 20 times.

CodePudding user response:

Quoting reply #8 by sbymdh2003:
That means running NOT EXISTS 20 times.

With a unique index that is still fine, and much better than sorting the billions of rows from your 20 tables combined.
With your UNION, the best possible sort complexity is O(n log n), where n = 1e10.
With a unique index, and if, as you say, 3 of the tables basically contain everything, the NOT EXISTS operations on the remaining 17 tables amount to a full scan of each table plus a binary search per row. Each lookup is O(log n), repeated over the 17 tables, where n = 6e8.
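
To put rough numbers on these estimates, here is a back-of-the-envelope sketch that counts comparisons only and ignores I/O. The symbol m is hypothetical: it stands for the total number of rows in the 17 remaining tables, which the thread does not give.

```latex
\begin{align*}
\text{sort for UNION:}\quad & n \log_2 n = 10^{10} \times \log_2 10^{10} \approx 10^{10} \times 33.2 \approx 3.3 \times 10^{11} \\
\text{one index lookup:}\quad & \log_2 (6 \times 10^{8}) \approx 29.2 \\
\text{NOT EXISTS over 17 tables:}\quad & m \times \log_2 (6 \times 10^{8}) \approx 29.2\, m
\end{align*}
```

Since m is smaller than n, the lookups come out cheaper by this count. A real B-tree index also has a much higher fan-out than a binary search, so each probe touches only a few blocks, but actual run time depends heavily on I/O patterns that this estimate leaves out.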

CodePudding user response:

1. UNION involves a sort operation, which hurts performance badly, so use UNION ALL instead of UNION wherever you can.
2. There are three ways to do the insert:
(1) If the 20 tables hold very little data, you can use:
 
create table TMP as
select distinct policy
from (
select policy from tab1
union all
select policy from tab2
union all
.
select policy from tab20
)

(2) If the tables hold a lot of data, I suggest inserting them separately:
 
create table TMP as select policy from tab1;

insert into TMP (policy) select policy from tab2;
commit;
insert into TMP (policy) select policy from tab3;
commit;
.
insert into TMP (policy) select policy from tab20;
commit;

Finally, remember to deduplicate the policy numbers in the TMP table.
(3) You can also use a stored procedure: dynamically look up the tables and columns that contain the policy number, then build the INSERT statements as dynamic SQL and run them.
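
Option (3) could be sketched as an anonymous PL/SQL block. This assumes that every source table lives in the current schema, that the column is named POLICY in each of them, and that TMP already exists; TMP_DEDUP is a hypothetical name for the deduplicated result. USER_TAB_COLUMNS also lists views, so the filter may need tightening.

```sql
begin
  -- Find every table in the current schema that has a POLICY column (assumed name),
  -- skipping the target table itself.
  for t in (select table_name
              from user_tab_columns
             where column_name = 'POLICY'
               and table_name <> 'TMP') loop
    -- Build and run one INSERT per source table.
    execute immediate
      'insert into TMP (policy) select policy from "' || t.table_name || '"';
    commit;
  end loop;
end;
/

-- Deduplicate once, after all the loads have finished.
create table TMP_DEDUP as select distinct policy from TMP;
```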

CodePudding user response:

Why do I find this question a little hard to make sense of...

From the start the original poster has only posted their own guesses, without actually describing the requirement or the problem. Could you describe in detail the original requirement and the problems you ran into after implementing the UNION?

CodePudding user response:

The original poster's real question is surely how to deduplicate across 20 tables efficiently. They planned to use UNION and are asking whether there is a better way. You are overcomplicating it!
Once deduplication is involved, it is hard to reach the efficiency the original poster has in mind.
If duplication across the 20 tables is very high, I think you can simply UNION straight through from the first table to the 20th.
If duplication is not high, UNION ALL everything first and then apply DISTINCT.
The build-a-table approach is only worthwhile when the data volume is very large, where it trades time for memory.

CodePudding user response:

Quoting reply #12 by kingkingzhu:
The original poster's real question is surely how to deduplicate across 20 tables efficiently. They planned to use UNION and are asking whether there is a better way. You are overcomplicating it!
Once deduplication is involved, it is hard to reach the efficiency the original poster has in mind.
If duplication across the 20 tables is very high, I think you can simply UNION straight through from the first table to the 20th.
If duplication is not high, UNION ALL everything first and then apply DISTINCT.
The build-a-table approach is only worthwhile when the data volume is very large, where it trades time for memory.

The original poster said the large tables have 500 to 600 million rows, and even the small ones have at least tens of millions.

CodePudding user response:

UNION is probably the best approach available.

CodePudding user response:

Deduplicate each table first, combine them with UNION ALL, then deduplicate again.
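
One reading of this suggestion (deduplicate each table, combine with UNION ALL, deduplicate once more), sketched for two of the tables; the remaining tables would follow the same pattern:

```sql
create table TMP as
select distinct policy              -- second pass: remove duplicates across tables
from (
  select distinct policy from tab1  -- first pass: remove duplicates within each table
  union all
  select distinct policy from tab2
);
```

The first pass shrinks each input before the final DISTINCT, which pays off mainly when individual tables contain many repeats of their own.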

CodePudding user response:

If you want to load the data into a temporary table, you do not need UNION. Create an empty table first, then run an INSERT for each source table. The INSERT can use the /*+ APPEND */ hint, with NOLOGGING added to suppress logging, and the SELECT can take a /*+ PARALLEL */ hint to increase parallelism. You can run it at night. Of course, keep a close eye on the UNDO tablespace.
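
A minimal sketch of that approach for the first table. The column type VARCHAR2(30) and the parallel degree 4 are assumptions; NOLOGGING is set here as a table attribute, and the same pattern would be repeated for the other tables.

```sql
-- Empty target table, created NOLOGGING so that direct-path loads generate minimal redo.
create table TMP (policy varchar2(30)) nologging;

-- Direct-path insert (APPEND) fed by a parallel scan of the source table.
insert /*+ APPEND */ into TMP (policy)
select /*+ PARALLEL(t 4) */ policy from tab1 t;

-- A direct-path insert must be committed before the same session
-- can read or modify TMP again.
commit;
```

Data loaded with NOLOGGING cannot be recovered from the redo logs, so this suits a scratch table that can be rebuilt, not data that must survive a media failure.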