I have a SQL query that consists of multiple left joins to the same table and has a very bad performance. The only difference is that each left join searches for a different feature in the table, which is then set as a new column.
SELECT
t0.*,
m1.value AS "Bestellnummer"
m2.value AS "Bestellposition",
m3.value AS "Aussteller",
m4.value AS "Baugruppe",
m5.value AS "Artikelnummer",
m6.value AS "Zertifikatsnummer",
m7.value AS "Arbeitspaket",
m8.value AS "Chargennummer",
m9.value AS "Systemnummer",
m10.value AS "Armaturnummer"
FROM table_1 t0
LEFT JOIN table_2 m1 ON (m1.dokument_id = t0.id AND m1.merkmal = 'Bestellnummer')
LEFT JOIN table_2 m2 ON (m2.dokument_id = t0.id AND m2.merkmal = 'Bestellposition')
LEFT JOIN table_2 m3 ON (m3.dokument_id = t0.id AND m3.merkmal = 'Aussteller')
LEFT JOIN table_2 m4 ON (m4.dokument_id = t0.id AND m4.merkmal = 'Baugruppe')
LEFT JOIN table_2 m5 ON (m5.dokument_id = t0.id AND m5.merkmal = 'Artikelnummer')
LEFT JOIN table_2 m6 ON (m6.dokument_id = t0.id AND m6.merkmal = 'Zertifikatsnummer')
LEFT JOIN table_2 m7 ON (m7.dokument_id = t0.id AND m7.merkmal = 'Arbeitspaket')
LEFT JOIN table_2 m8 ON (m8.dokument_id = t0.id AND m8.merkmal = 'Chargennummer')
LEFT JOIN table_2 m9 ON (m9.dokument_id = t0.id AND m9.merkmal = 'Systemnummer')
LEFT JOIN table_2 m10 ON (m10.dokument_id = t0.id AND m10.merkmal = 'Armaturnummer')
I have already tried to replace all the left joins with the following statement:
SELECT
t0.id,
t0.document_id,
t0.vault_id,
t0.document_type_id,
t0.state,
t0.name,
to_number(to_char(unixts_to_date(t0.file_create_date), 'YYYYMMDD')) AS file_create_date,
to_number(to_char(unixts_to_date(t0.file_change_date), 'YYYYMMDD')) AS file_change_date,
t1."Bestellnummer",
t1."Aussteller",
t1."Baugruppe",
t1."Artikelnummer",
t1."Zertifikatsnummer",
t1."Arbeitspaket",
t1."Chargennummer",
t1."Systemnummer",
t1."Armaturnummer"
FROM table_1 t0
LEFT JOIN
(
SELECT
dokument_id,
max(case when merkmal = 'Bestellnummer' then value else null end ) as "Bestellnummer",
max(case when merkmal = 'Bestellposition' then value else null end ) as "Bestellposition",
max(case when merkmal = 'Aussteller' then value else null end ) as "Aussteller",
max(case when merkmal = 'Baugruppe' then value else null end ) as "Baugruppe",
max(case when merkmal = 'Artikelnummer' then value else null end ) as "Artikelnummer",
max(case when merkmal = 'Zertifikatsnummer' then value else null end ) as "Zertifikatsnummer",
max(case when merkmal = 'Arbeitspaket' then value else null end ) as "Arbeitspaket",
max(case when merkmal = 'Chargennummer' then value else null end ) as "Chargennummer",
max(case when merkmal = 'Systemnummer' then value else null end ) as "Systemnummer",
max(case when merkmal = 'Armaturnummer' then value else null end ) as "Armaturnummer"
FROM
table_2
GROUP BY dokument_id
) t1 on t1.dokument_id = t0.id
But this does not work, since a document of the table tabe_1
for example can have 2 times the characteristic 'article number' with different values.
Does anyone have an idea?
Thank you!
CodePudding user response:
Ouch. It looks like someone in your organization has unfortunately "discovered" the Entity-Attribute-Value data model, which promises flexibility but is hardly scalable at all. I don't know if there is an easy trick, and you did not provide useful details like an explain plan output.
Your problem is made worse by the fact that you have no WHERE
condition on table_1
. So, guessing here, you are probably looking at a HASH_JOIN
operation using table_1
as the "build table"[1] and then 9 loops using table_2
as the "probe table". Worse still, it's likely that the build table from table_1
does not fit in PGA memory, making the whole process way less efficient.
First, you should have indexes on table_1(id)
and table_2(dokument_id, merkmal, value)
. Including value
in the table_2
index can be very helpful in a situation like this, because it means Oracle never has to read the blocks from table_2
-- everything it needs from that table is already in the index (this is called a "covering index"). That index is likely to be smaller than the table.
Second, if that doesn't help, consider restructuring table_1
as an Index-Organized Table. That, along with the covering index on table_2
might make a sort-merge join viable.
No guarantees here. You have a bad data model here for the query you are trying to execute. Clever SQL can only go so far. The real answer is to fix your data model. Move each the attribute value from table_2
into a new column on table_1
.
P.S. If you do have a WHERE
clause on table_1
and just omitted it from your question for some reason, you should update your question because it changes the answer a lot.
P.P.S. If you are still stuck, update your question with DBMS_XPLAN
output with statistics.
CodePudding user response:
Have you considered making a temp table and then doing updates to it utilizing the conditions in your left joins on each update statement, only as inner joins?