Home > other >  Slow SQL query to get values from xml where some value is in xml
Slow SQL query to get values from xml where some value is in xml

Time:01-01

I want to make my SQL execution faster.

I have the table with a XML column. XML stored is in this format:

<r eid="bdcd3ba2-e370-4140-8c46-01aed584bab2">
    <name>
        <customValue>Aperol</customValue>
    </name>
    <description></description>
    <num>46310660710877889975</num>
    <outerEconomicActivityNomenclatureCode>ccbdd540-3bb4-4a9f-a8d3-5a31b9459846</outerEconomicActivityNomenclatureCode>
    <type>DISH</type>
</r>

I want to get <outerEconomicActivityNomenclatureCode> value from XML row, where <type> in XML is DISH and <num> is in ('46310660710877889975', '46310660710877889976', '46310660710877889975', '46310660710877889975')

My SQL queries:

this takes 1 minute and 44 seconds

SELECT
    m.c.value('(name/customValue/text())[1]', 'varchar(50)') AS dish
FROM 
    _fabbrica.dbo.entity dish
CROSS APPLY 
    (SELECT CAST(dish.xml AS xml) AS realxml) s
CROSS APPLY
    s.realxml.nodes('r') m(c)
WHERE
    dish.type = 'Product' 
    AND m.c.value('(type/text())[1]', 'varchar(50)') = 'DISH' 
    AND m.c.value('(num/text())[1]', 'varchar(50)') IN ('94948475659909879621', 'code', 'code')

This query takes 2 minutes

SELECT 
    dish.id, 
    SUBSTRING(dish.xml, (CHARINDEX('<customValue>', dish.xml)   13), (CHARINDEX('</customValue>', dish.xml) - CHARINDEX('<customValue>', dish.xml) - 13)) AS 'Dish', 
    SUBSTRING(outerEanCode.xml, (CHARINDEX('<outerEanCode>', outerEanCode.xml)   14), (CHARINDEX('</outerEanCode>', outerEanCode.xml) - CHARINDEX('<outerEanCode>', outerEanCode.xml) - 14)) AS 'OuterEanCode'
FROM 
    [_fabbrica].[dbo].[entity] dish 
JOIN 
    [_fabbrica].[dbo].[entity] outerEanCode ON outerEanCode.id = CASE WHEN CHARINDEX( '<outerEconomicActivityNomenclatureCode>', dish.xml ) > 0 THEN SUBSTRING( dish.xml, ( CHARINDEX( '<outerEconomicActivityNomenclatureCode>', dish.xml )   39 ), ( CHARINDEX( '</outerEconomicActivityNomenclatureCode>', dish.xml ) - CHARINDEX( '<outerEconomicActivityNomenclatureCode>', dish.xml ) - 39 ) ) ELSE '61D63FE7-212C-4847-BA32-1563D97E2424' END
WHERE 
    dish.type = 'Product' 
    AND SUBSTRING(dish.xml, (CHARINDEX('<num>', dish.xml)   5), (CHARINDEX('</num>', dish.xml) - CHARINDEX('<num>', dish.xml) - 5)) IN ('94948475659909879621')

Are there some ways to change or create another query to reduce the execution time?

CodePudding user response:

The m.c.value('(num/text())[1]', 'varchar(50)') in (...) list seems to be optimised badly (enter image description here

One way to avoid this would be to just evaluate m.c.value('(num/text())[1]', 'varchar(50)') once per document and materializing the result to a temp table or table variable and then querying that with an IN.

But the below seems to solve the issue without that

SELECT m.c.value('(name/customValue/text())[1]', 'varchar(50)') as dish
FROM _fabbrica.dbo.entity dish
CROSS APPLY (SELECT CAST(dish.xml as xml) as realxml) s
CROSS APPLY s.realxml.nodes('r[type = "DISH"][num =(
    "46310660710877889975","94948475659909879621","94948475659909879704","94948474748856478602",
    "46310660710877881223","94948475659909879363","94948474748847478248","26310360710077875944",
    "94948475659909879207","94948475659909879700","94948475659909879650","94948474748856478612",
    "94948474748856478618","94948475659909879704","94948474748856478496","46310660710877896883",
    "46310660710877881223","94948475659909879363","94948474747747475446","94948475659909879366",
    "94948475659909879121","26310360710077875994","94948475659909877377","94948475659909879123",
    "94948475659909879124","94948475659909879122","94948474748847478246","94948475659909879549",
    "94948474748856478594","94948474748847478244","94948475659909879597","94948475659909879363",
    "94948474747747475446","94948475659909879366","94948475659909879121",
    "94948475659909879123","94948475659909879122","46310660710877890446"
)]') m(c)
WHERE dish.type = 'Product'
  • Related