Home > Software design >  TSQL XML - Node attributes as columns or rows in same query
TSQL XML - Node attributes as columns or rows in same query

Time:12-07

I'm working with an XML file with a node similar to this:

<Process>    
    <Step No="1" Types="D" Temp="25" Secs="6" Macro="2">Enable Mixers</Step>
    <Step No="11" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers1</Step>
    <Step No="2" Types="D2" Temp="22" Secs="62" Macro="23">Enable Mixers3</Step>
</Process>

DDL:

DROP TABLE IF EXISTS MyXML2
GO
CREATE TABLE MyXML2(ID INT IDENTITY(1,1), c XML)
INSERT MyXML2(c) VALUES 
    ('<Process><Step No="1" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="11" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="2" Types="D2" Temp="22" Secs="62" Macro="23">Enable Mixers3</Step></Process>')
GO

I need to get a database structure like this (example given for only 1 "Step" above):

StepNumber ColumnName ColumnValue
1 Types D
1 Temp 25
1 Secs 6
1 Macro 2

My work so far: I've been able to map each attribute into a row.(Edit: updated to working example based on DDL above)

SELECT 
    col.value('local-name(.)', 'VARCHAR(50)') AS ColumnName,
    col.value('.[1]', 'VARCHAR(MAX)') AS ColumnValue
FROM [MyXML2]
CROSS APPLY [c].nodes('/Process/Step/@*') doc(col)

The output looks like:

But I need the "No" attribute to be a column. Is there a way to do this all in one query?

Fiddle: enter image description here


Update: 2021-12-06

Following the new information which we got, here are some new solutions and explanation. The above should be useful for future readers which have similar question.

So, in the above solutions I focused on a case where we have single Step node in each row in the table. According the new information we might have multiples nodes of Step in the same value. Moreover, the Step nodes are wrapped in another node name Process

For example, a specific XML value can be: <Process><Step No="1" Types="D1" Temp="1" Secs="61" Macro="21">Enable Mixers1</Step> <Step No="11" Types="D11" Temp="11" Secs="611" Macro="21">Enable Mixers2</Step> <Step No="111" Types="D111" Temp="111" Secs="6111" Macro="23">Enable Mixers3</Step></Process>

Demo three: using variable, Step nodes structure is unknown, multiple Step nodes

In this demo I will resent solution based on the same approach as solution one

declare @xml XML = '<Process><Step No="1" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="11" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers2</Step>
     <Step No="2" Types="D2" Temp="22" Secs="62" Macro="23">Enable Mixers3</Step></Process>'

-->>> HIGHLY recommended to un-comment below lines and check what I am using as input for the CTE in this solution
--SELECT
--  t.c.value('./@No', 'VARCHAR(128)') as StepNumber,
--  t.c.query ('.') as Types
--from @xml.nodes('Process/.[1]/*')as t(c)
    
;With MyCTE01 as (
    SELECT
        t.c.value('./@No', 'INT') as StepNumber,
        t.c.query ('.') as MyXML
    from @xml.nodes('Process/.[1]/*')as t(c)
)
SELECT 
    MyCTE01.StepNumber,
    doc.Col.value('local-name(.[1])','VARCHAR(100)') ColumnName,
    doc.Col.value('.[1]','VARCHAR(100)') ColumnValue
FROM MyCTE01
CROSS APPLY MyCTE01.MyXML.nodes('/Step/@*') doc(Col)
WHERE not doc.Col.value('local-name(.[1])','VARCHAR(100)') = 'No'
GO

This solution will work for you but if the structure of the Step node is always the same - meaning you have the same attributes as in al the examples during the discussion, then we can get much much better solutions...

Demo four: Using variable, Step nodes has a known structure, multiple Step nodes

Since we know which attribute we have then we can hard coded use the names. In this case we do not this part which meant to find all the attributes CROSS APPLY MyCTE01.MyXML.nodes('/Step/@*')

We can use totally different approach, directly getting the values of the know attributes and using UNPIVOT. This solution provide much better performance but it is less flexible then solutions three.

declare @xml XML = '<Process><Step No="1" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="11" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers2</Step>
     <Step No="2" Types="D2" Temp="22" Secs="62" Macro="23">Enable Mixers3</Step></Process>'

--select 
--  t.c.value('./@No', 'VARCHAR(128)') as id,
--  t.c.value('./@Types', 'VARCHAR(128)') as Types,
--  t.c.value('./@Temp', 'VARCHAR(128)') as Temp,
--  t.c.value('./@Secs', 'VARCHAR(128)') as Secs,
--  t.c.value('./@Macro', 'VARCHAR(128)') as Macro,
--  t.c.value('./@Macro', 'VARCHAR(128)') as Macro
--from @xml.nodes('Process/.[1]/*')as t(c)

SELECT StepNumber, Column_Name, Column_Value
FROM(
    select 
        t.c.value('./@No', 'VARCHAR(128)')    as StepNumber,
        t.c.value('./@Types', 'VARCHAR(128)') as Types,
        t.c.value('./@Temp', 'VARCHAR(128)')  as Temp,
        t.c.value('./@Secs', 'VARCHAR(128)')  as Secs,
        t.c.value('./@Macro', 'VARCHAR(128)') as Macro
    from @xml.nodes('Process/.[1]/*')as t(c)
   ) p  
UNPIVOT  
   (Column_Value FOR Column_Name IN (Types, Temp, Secs, Macro)  )AS unpvt;  
GO

Note! You can use this approach for unknown structure as well if you use dynamic queries and first find the attributes in the XML.

Demo five: Using variable, Step nodes has a known structure, multiple Step nodes

This solution has the same limitation as solution four (known structure) but in addition it only fits when we are working on single value like a variable. Therefore, if we want to implement it on table then we might need to loop all rows which might reduce performance dramatically. But when this solution fits the needs then it should provide best performance!

/***BEST SOLUTION - if fits the needs***/
-- XML to Tabular using OPENXML
DECLARE @idoc INT, @xml XML = '<Process><Step No="1" Types="D1" Temp="1" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="11" Types="D11" Temp="11" Secs="611" Macro="21">Enable Mixers2</Step>
     <Step No="111" Types="D111" Temp="111" Secs="6111" Macro="23">Enable Mixers3</Step></Process>'
--Create an internal representation of the XML document.  
-- Reads the XML text -> parses the text by using the MSXML parser -> and provides the parsed document in a state ready for consumption. 
EXEC sp_xml_preparedocument @idoc OUTPUT, @xml;

--SELECT 
--  No    as StepNumber,
--  Types as Types,
--  Temp  as Temp,
--  Secs  as Secs,
--  Macro as Macro,
--  NoteValue
--FROM OPENXML (@idoc, '/Process/Step')  
--  WITH (
--      -- When OPENXML does not have input of third parameter then we can choose if this will atribute or node
--      -- usig '@No' will bring the value of atribute and using 'No' will bring the value of node
--      No        INT          '@No'   ,
--      Types     VARCHAR(128) '@Types',  
--      Temp      VARCHAR(128) '@Temp' ,  
--      Secs      VARCHAR(128) '@Secs' ,  
--      Macro     VARCHAR(128) '@Macro', 
--      NoteValue VARCHAR(128) '.'
--  ) 

SELECT StepNumber, Column_Name, Column_Value
FROM(
    SELECT 
    No    as StepNumber,
    Types as Types,
    Temp  as Temp,
    Secs  as Secs,
    Macro as Macro
    FROM OPENXML (@idoc, '/Process/Step',1)  
        WITH (
            No    INT,
            Types VARCHAR(128),  
            Temp  VARCHAR(128),  
            Secs  VARCHAR(128),  
            Macro VARCHAR(128)
        ) 
   ) p  
UNPIVOT  
   (Column_Value FOR Column_Name IN (Types, Temp, Secs, Macro)  )AS unpvt;  
--sp_xml_removedocument free's up the memory.  
EXEC sp_xml_removedocument @idoc   
GO

So... we have multiple approaches whic fits different case... but we still need to think about tables...

Demo six: Using table, Step nodes has unknown structure, multiple Step nodes

You can implement Demo four if this fit (known structure or using dynamic query), but for the last demo I will implement Demo three approach on a case that we have multiple rows in a table which each row includes XML that has multiple Step nodes

DROP TABLE IF EXISTS MyXML_Tbl
GO
CREATE TABLE MyXML_Tbl(ID INT IDENTITY(1,1), MyXML XML)
GO
INSERT MyXML_Tbl(MyXML) VALUES 
    ('<Process><Step No="1" Types="D1" Temp="1" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="11" Types="D11" Temp="11" Secs="611" Macro="21">Enable Mixers1</Step>
     <Step No="111" Types="D111" Temp="111" Secs="6111" Macro="23">Enable Mixers3</Step></Process>')
INSERT MyXML_Tbl(MyXML) VALUES 
    ('<Process><Step No="2" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="22" Types="D1" Temp="11" Secs="61" Macro="21">Enable Mixers1</Step>
     <Step No="222" Types="D2" Temp="22" Secs="62" Macro="23">Enable Mixers3</Step></Process>')
GO

--SELECT * FROM MyXML_Tbl
--GO

--SELECT
--  tb.ID,
--  tx.c.value('./@No', 'VARCHAR(128)') as StepNumber,
--  tx.c.query ('.') as Types
--from MyXML_Tbl tb
--CROSS APPLY tb.MyXML.nodes('Process/.[1]/*')as tx(c)

;With MyCTE01 as (
    SELECT
        tb.ID,
        tx.c.value('./@No', 'VARCHAR(128)') as StepNumber,
        tx.c.query ('.') as MyXML
    from MyXML_Tbl tb
    CROSS APPLY tb.MyXML.nodes('Process/.[1]/*')as tx(c)
)
SELECT 
    MyCTE01.id,
    MyCTE01.StepNumber,
    doc.Col.value('local-name(.[1])','VARCHAR(100)') ColumnName,
    doc.Col.value('.[1]','VARCHAR(100)') ColumnValue
FROM MyCTE01
CROSS APPLY MyCTE01.MyXML.nodes('/Step/@*') doc(Col)
WHERE not doc.Col.value('local-name(.[1])','VARCHAR(100)') = 'No'
GO

I hope this is useful. It should cover all cases mentioned in the discussion

  • Related