Home > OS >  Maintaining relationship between XML nodes. SQL. XPath
Maintaining relationship between XML nodes. SQL. XPath

Time:09-15

I am trying to pull data from an XML file using SQL / XPath

For my example lets say that the table is called [Books] and the column that contains the XML is called [XML]

Example XML:

<Books>
 <Book>
  <Title>Apology</Title>
  <Author>Socrates</Author>
 </Book>
 <Book>
  <Title>Republic</Title>
  <Author>Plato</Author>
 </Book>
 <Book>
  <Title>Politics</Title>
  <Author>Aristotle</Author>
 </Book>
</Books>

Desired Result:

Title Author
Apology Socrates
Republic Plato
Politics Aristotle

I am able to pull the distinct 'Titles' with a query like this:

SELECT 
a.x.value('text()[1]', 'VARCHAR(100)') AS 'Title'
FROM [dbo].[Books]
CROSS APPLY [XML].nodes('//Books/Book/Title') AS a (x)

Result:

Title
Apology
Republic
Politics

However, I am unable to pull the other nodes while maintaining their relationship with eachother.

For example, if i add a second 'Cross Apply':

SELECT 
a.x.value('text()[1]', 'VARCHAR(100)') AS 'Title',
b.x.value('text()[1]', 'VARCHAR(100)') AS 'Author'
FROM [dbo].[Books]
CROSS APPLY [XML].nodes('//Books/Book/Title') AS a (x)
CROSS APPLY [XML].nodes('//Books/Book/Author') AS b (x)

The result will be something like this:

Title Author
Apology Socrates
Apology Plato
Apology Aristotle
Republic Plato
Republic Socrates
Republic Aristotle
Politics Aristotle
Politics Plato
Politics Socrates

I have also tried to pull both nodes using one cross apply:

SELECT 
a.x.value('(//Title/text())[1]', 'VARCHAR(100)') AS 'Title',
a.x.value('(//Author/text())[1]', 'VARCHAR(100)') AS 'Author'
FROM [dbo].[Books]
CROSS APPLY [XML].nodes('//Books/Book') AS a (x)

When i try this i just get the first node duplicated:

Title Author
Apology Socrates
Apology Socrates
Apology Socrates

Is there any way i can preserve the relationship between these 2 nodes?

Thank you

CodePudding user response:

In this particular instance, you don't need to correlate anything. You can just shred the Books/Book node, then select each value separately

SELECT 
  x1.Book.value('(Title/text())[1]', 'VARCHAR(100)') AS Title,
  x1.Book.value('(Author/text())[1]', 'VARCHAR(100)') AS Author
FROM dbo.Books b
CROSS APPLY b.XML.nodes('Books/Book') AS x1 (book);

If you really need to correlate them (for example if there are multiple Title nodes), you can just pass the result of one .nodes to the next

SELECT 
  x2.Title.value('text()[1]', 'VARCHAR(100)') AS Title,
  x3.Author.value('text()[1]', 'VARCHAR(100)') AS Author
FROM dbo.Books b
CROSS APPLY b.XML.nodes('Books/Book') AS x1 (Book)
CROSS APPLY x1.Book.nodes('Title') AS x2 (Title)
CROSS APPLY x1.Book.nodes('Author') AS x3 (Author);

db<>fiddle

Also note that the // all-descendants axis is slow, and you should favour the / direct-child axis if possible.

  • Related