Home > Blockchain >  Don't understand XML load into SQL database
Don't understand XML load into SQL database

Time:08-01

I have the following XML file which i want to load into SQL:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns4:Dataview1 xmlns="ux:no::ehe:v5:actual:aver" xmlns:ns4="ux:no:ehe:v5:move"  xmlns:ns2="ux:no:ehe:v5:centre" xmlns:ns5="ux:no:ehe:v5:tran">
  <ns4:Content versionCode="16000">
    <ns4:gen start="1961-07-01" end="1961-07-01">
    </ns4:gen>
      <Weight unit="KG">30.0000</Weight>
    <ns4:gen start="2017-09-19">
    </ns4:gen>
    <ns4:gen start="1961-07-02" end="2016-09-30">
      <ns5:translation>
        <ns2:En>Example Data English 234354</ns2:En>
      </ns5:translation>
      <Type>Circle</Type>
      <Weight unit="KG">10.0000</Weight>
    </ns4:gen>
    <ns4:gen start="2016-10-01" end="2017-09-18">
      <ns5:translation>
        <ns2:De>Beispieldaten Deutschland 23443</ns2:De>
        <ns2:En>Example Data English 23443</ns2:En>
      </ns5:translation>
      <Weight unit="KG">20.0000</Weight>
    </ns4:gen>
  </ns4:Content>
</ns4:Dataview1>

Following code is used:

CREATE TABLE tbl (
    ID INT IDENTITY(1, 1) PRIMARY KEY,
    XmlColumn XML
);

INSERT INTO tbl(XmlColumn)
SELECT * FROM OPENROWSET(BULK N'C:\Data\demoset.xml', SINGLE_BLOB) AS x;

WITH XMLNAMESPACES(DEFAULT  'ux:no::ehe:v5:actual:aver',
                            'ux:no:ehe:v5:tran' as ns5,
                            'ux:no:ehe:v5:move' AS ns4,
                            'ux:no:ehe:v5:cat:fill' as ns3,
                            'ux:no:ehe:v5:centre' as ns2)

SELECT   c.value('@versionCode', 'VARCHAR(100)') as versionCode,
         x.value('@start', 'DATE') as Start_date,
         x.value('@end', 'DATE') as End_date,
         x.value('(ns5:translation/ns2:En)[1]', 'VARCHAR(100)') as Trans_EN,
         x.value('Type[1]', 'VARCHAR(100)') as Typed,
         x.value('(Weight/unit)[1]', 'VARCHAR(100)') as Unitdata,
         x.value('Weight[1]', 'VARCHAR(100)') as Weightdata
INTO dbo.testtbl
FROM tbl
    CROSS APPLY XmlColumn.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
    OUTER APPLY t1.c.nodes('ns4:gen') AS t2(x);

I have three questions:

1.) I don't understand why it is not possible to use @ns2:En, @Type, @Weight. Therefore, why do i need to place [1] behind data ns2:En, Type & Weight?

2.) Why is '(Weight/unit)[1]' not working?

3.) If I change outer apply to cross apply results are the same. I think i need to use outer apply since it is a left outer join in my opinion, but I don't understand completely. In my previous topic cross apply is used in the answer. Slow XML import with SQL server

CodePudding user response:

    • .value always needs a singleton: in other words it must be statically known (upfront) that there will be exactly one node to retrieve. This is normally done using a predicate [1] to retrieve the first node.
    • Retrieving a whole node using .value implicitly takes the text() node (of which there may be more than one), however it is better and more performant to specify it explicitly like (Type/text())[1]
    • In the case of attributes (see 2.), there is always one unique attribute, so this doesn't need [1]. But this only applies if you are retrieving a top-level attribute. If you go down a level then you may have multiple attributes, again requiring [1].
  1. Attributes are retrieved using @, so you need (Weight/@unit)[1].
  2. The .nodes function shreds the XML into separate rows. CROSS APPLY implies that you expect rows and to act like an INNER JOIN, whereas OUTER APPLY implies that there may not be rows but you still want nulls, like a LEFT JOIN. Up to you as to what you expect.
WITH XMLNAMESPACES(DEFAULT  'ux:no::ehe:v5:actual:aver',
                            'ux:no:ehe:v5:tran' as ns5,
                            'ux:no:ehe:v5:move' AS ns4,
                            'ux:no:ehe:v5:cat:fill' as ns3,
                            'ux:no:ehe:v5:centre' as ns2)

SELECT   c.value('@versionCode', 'VARCHAR(100)') as versionCode,
         x.value('@start', 'DATE') as Start_date,
         x.value('@end', 'DATE') as End_date,
         x.value('(ns5:translation/ns2:En/text())[1]', 'VARCHAR(100)') as Trans_EN,
         x.value('(Type/text())[1]', 'VARCHAR(100)') as Typed,
         x.value('(Weight/@unit)[1]', 'VARCHAR(20)') as Unitdata,
         x.value('(Weight/text())[1]', 'decimal(18,9)') as Weightdata
FROM tbl
    CROSS APPLY tbl.XmlColumn.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
    OUTER APPLY t1.c.nodes('ns4:gen') AS t2(x);

db<>fiddle

You don't need to create a table just to use OPENROWSET, you can feed it straight in:

WITH XMLNAMESPACES(DEFAULT  'ux:no::ehe:v5:actual:aver',
                            'ux:no:ehe:v5:tran' as ns5,
                            'ux:no:ehe:v5:move' AS ns4,
                            'ux:no:ehe:v5:cat:fill' as ns3,
                            'ux:no:ehe:v5:centre' as ns2)

SELECT   c.value('@versionCode', 'VARCHAR(100)') as versionCode,
         x.value('@start', 'DATE') as Start_date,
         x.value('@end', 'DATE') as End_date,
         x.value('(ns5:translation/ns2:En/text())[1]', 'VARCHAR(100)') as Trans_EN,
         x.value('(Type/text())[1]', 'VARCHAR(100)') as Typed,
         x.value('(Weight/@unit)[1]', 'VARCHAR(20)') as Unitdata,
         x.value('(Weight/text())[1]', 'decimal(18,9)') as Weightdata
FROM OPENROWSET(BULK N'C:\Data\demoset.xml', SINGLE_BLOB) AS blk(col)
    CROSS APPLY (VALUES (
        CAST(blk.col AS xml)
    ) ) AS t(XmlColumn)
    CROSS APPLY t.XmlColumn.nodes('/ns4:Dataview1/ns4:Content') AS t1(c)
    OUTER APPLY t1.c.nodes('ns4:gen') AS t2(x);
  • Related