Home > Back-end >  TSQL - Parse XML metadata & values dynamically
TSQL - Parse XML metadata & values dynamically

Time:11-24

Background I have an XML column in my SQL table (using SQL Server). Each node has a different amount of metadata. For example, in the below example, Step Number 1 has the only "No" as metadata while, Step Number 2 additionally has RBuffer.

<Step No="1" >Step Number 1</Step>
<Step No="2" RBuffer="6000">Step Number 2</Step>
<Step No="3" Macro="5">Step Number 3</Step>

Expected Output

I'd like to extract this metadata dynamically while also grabbing the value. For the example above, this would look like the below table. Importantly, it shouldn't matter how many metadata tags there are, I want it to go through all of them. Some of my data has 10 tags.

Node Step Key Value
Step 1 Value Step Number 1
Step 2 RBuffer 6000
Step 2 Value Step Number 2
Step 3 Macro 5
Step 3 Value Step Number 3

Work so far

So far, I've been able to extract the metadata in a static way:

SELECT o.value('@No', 'varchar(32)') [Step]
      ,o.value('@Macro', 'varchar(32)') [Macro]
      ,o.value('@RBuffer', 'varchar(32)') [RBuffer]
      ,o.value('(text())[1]', 'varchar(32)') [Action]
  FROM [dbo].[dw_mrd_vss_rundetail_stg] S
    CROSS APPLY S.[rundata_detail].nodes('Step') xmlData(o)

Which gives the following table:

Step Macro RBuffer Action
1 NULL NULL Step Number 1
2 NULL 6000 Step Number 2
3 5 NULL Step Number 3

But I have to explicitly call each value and creating columns in this way isn't scalable. Any help would be appreciated. I am relatively new to this kind of data munging in SQL, so explanations of code would be helpful.

CodePudding user response:

A dynamic solution. If the "No" attribute is optional too and a node name is varying as well,

Declare @xml Xml = '<doc>
  <Step No="1" >Step Number 1</Step>
  <Step No="2" RBuffer="6000">Step Number 2</Step>
  <Step No="3" Macro="5">Step Number 3</Step>
  <Step Macro="7">Step Number 4</Step>
  <Node No="5">Step Number 5</Node>
</doc>';

select x.*
from @xml.nodes('/doc/*') d(dn)
cross apply (
  -- element data and "No" attr 
  select n.value('local-name(.)', 'varchar(32)') [node], 'Value' [Key], n.value('@No', 'varchar(32)') [Step], n.value('(text())[1]', 'varchar(32)') [Value]
  from d.dn.nodes('.') s(n)
  union all
  -- attributes data but "No"
  select n.value('local-name(../.)', 'varchar(32)') [node], n.value('local-name(.)', 'varchar(32)') [Key], n.value('../@No', 'varchar(32)') [Step], n.value ('data(.)', 'varchar(32)') [Value]
  from d.dn.nodes('./@*[local-name(.)!="No"]') a(n)
) x

Returns

node    Key Step    Value
Step    Value   1   Step Number 1
Step    Value   2   Step Number 2
Step    RBuffer 2   6000
Step    Value   3   Step Number 3
Step    Macro   3   5
Step    Value       Step Number 4
Step    Macro       7
Node    Value   5   Step Number 5

CodePudding user response:

You can CROSS APPLY a sequence containing the attributes and the inner text. Then for each of those, you can use local-name(.) to get the name of an attribute.

SELECT
  Node  = x1.step.value('local-name(.)[1]','varchar(20)'),
  Step  = x1.step.value('@No','int'),
  [Key] = x2.vals.value('if (local-name(.) = "") then "Value" else local-name(.)','varchar(20)'),
  Value = x2.vals.value('.','nvarchar(100)')
FROM dw_mrd_vss_rundetail_stg s
CROSS APPLY s.rundata_detail.nodes('/Step') x1(step)
CROSS APPLY x1.step.nodes('(./@*[local-name(.) != "No"], ./text())') x2(vals);

db<>fiddle

If you want to include all nodes, even ones that are not Step, just change the first .nodes to .nodes('/*')

  • Related