I have a XML file with a size of 1GB. I use the following code to load the data into sql server.
DECLARE @xmlvar XML
SELECT @xmlvar = BulkColumn
FROM OPENROWSET(BULK 'C:\Data\demo.xml', SINGLE_BLOB) x;
WITH XMLNAMESPACES(DEFAULT 'ux:no::ehe:v5:actual:aver',
'ux:no:ehe:v5:move' AS ns4,
'ux:no:ehe:v5:cat:fill' as ns3,
'ux:no:ehe:v5:centre' as ns2)
SELECT
zs.value(N'(../@versionCode)', 'VARCHAR(100)') as versionCode,
zs.value(N'(@Start)', 'VARCHAR(50)') as Start_date,
zs.value(N'(@End)', 'VARCHAR(50)') as End_date
into testtbl
FROM @xmlvar.nodes('/ns4:Dataview1/ns4:Content/ns4:gen') A(zs);
I takes now more than 2 hours to run the query and it is not finished. I have tested the query with a smaller version of the XML file and that works. Any tips on improving the loading speed?
Thank you.
CodePudding user response:
In my opinion it's better to use an SSIS Package for importing XML files. It has a component named "XML Source" for loading XML file.
There is a useful article at : https://www.sqlshack.com/import-xml-documents-into-sql-server-tables-using-ssis-packages/
CodePudding user response:
(1) As @Stu already pointed out, loading XML file first into a single row table will speed up the process of loading significantly.
(2) it is not a good idea to traverse XML up in the XPath expressions. Like here:
c.value('../@versionCode', 'VARCHAR(100)') as versionCode
But the XML structure was not shared in the question. So, it is impossible to suggest anything concrete.
Check it out below.
SQL
DECLARE @tbl TABLE(
ID INT IDENTITY(1, 1) PRIMARY KEY,
XmlColumn XML
);
INSERT INTO @tbl(XmlColumn)
SELECT * FROM OPENROWSET(BULK N'C:\Data\demo.xml', SINGLE_BLOB) AS x;
WITH XMLNAMESPACES(DEFAULT 'ux:no::ehe:v5:actual:aver',
'ux:no:ehe:v5:move' AS ns4,
'ux:no:ehe:v5:cat:fill' as ns3,
'ux:no:ehe:v5:centre' as ns2)
SELECT c.value('../@versionCode', 'VARCHAR(100)') as versionCode,
c.value('@Start', 'VARCHAR(50)') as Start_date,
c.value('@End', 'VARCHAR(50)') as End_date
INTO testtbl
FROM @tbl
CROSS APPLY XmlColumn.nodes('/ns4:Dataview1/ns4:Content/ns4:gen') AS t(c);