I need to remove empty nodes from an XML document using XPath expressions.
Consider the sample XML below. The 'nickname' and 'height' nodes are not needed because they are empty.
Original Data
<class>
  <student rollno = "393">
    <firstname>Dinkar</firstname>
    <lastname>Kad</lastname>
    <nickname></nickname>
    <marks>85</marks>
    <height></height>
  </student>
</class>
Expected Data
<class>
  <student rollno = "393">
    <firstname>Dinkar</firstname>
    <lastname>Kad</lastname>
    <marks>85</marks>
  </student>
</class>
CodePudding user response:
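You can use XPath with the predicate not(node()) to select all elements that have no child nodes of any kind (no elements, text, or comments). XPath only selects the nodes; the removal itself is done through the DOM.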
For example:
<?php
$doc = new DOMDocument;
// Discard whitespace-only text nodes between elements so formatOutput can re-indent.
$doc->preserveWhiteSpace = false;
$doc->loadXML('<parentnode>
    <tag1>2</tag1>
    <tag2>4</tag2>
    <tag3></tag3>
    <tag2>4</tag2>
    <tag3></tag3>
    <tag2>4</tag2>
    <tag3></tag3>
</parentnode>');
$xpath = new DOMXPath($doc);
// Select every element without child nodes and detach it from its parent.
foreach ($xpath->query('//*[not(node())]') as $node) {
    $node->parentNode->removeChild($node);
}
$doc->formatOutput = true;
echo $doc->saveXML();
prints
<?xml version="1.0"?>
<parentnode>
  <tag1>2</tag1>
  <tag2>4</tag2>
  <tag2>4</tag2>
  <tag2>4</tag2>
</parentnode>
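The same loop can be applied to the data from the question. The sketch below is a variation, not part of the original answer: it assumes that elements containing only whitespace should also count as empty, so the 'height' element is given whitespace content here and the predicate is changed to not(*) and not(normalize-space()). That predicate also removes elements whose only content is a comment, which may or may not be what you want.

```php
<?php
// Sample data from the question; <height> holds only whitespace here
// (an assumption, to show the difference from not(node())).
$xml = <<<'XML'
<class>
  <student rollno="393">
    <firstname>Dinkar</firstname>
    <lastname>Kad</lastname>
    <nickname></nickname>
    <marks>85</marks>
    <height>   </height>
  </student>
</class>
XML;

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
// Match elements with no child elements and no non-whitespace text.
$emptyElements = $xpath->query('//*[not(*) and not(normalize-space())]');
foreach ($emptyElements as $node) {
    $node->parentNode->removeChild($node);
}

$doc->formatOutput = true;
echo $doc->saveXML();
```

Note that this is a single pass: if removing children leaves a parent element empty, that parent stays in the document unless the query is run again.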
CodePudding user response:
The XPath expression to target those empty elements is: *[not(node())] (or //*[not(node())] to search the whole document).
However, XPath can only select nodes; it cannot modify or transform the XML on its own.
I would apply the following XSLT stylesheet. Its default template (the identity transform) copies all content, and one additional template matches elements that have no child nodes (no elements and no text()). That template is empty, which means the matched elements get dropped from the output.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*[not(node())]"/>
</xsl:stylesheet>
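One way to run this stylesheet is PHP's XSLTProcessor class. The following is a minimal sketch, assuming the xsl extension is enabled; the variable names are illustrative, and the source document is loaded with preserveWhiteSpace set to false (an assumption) so that indent="yes" can re-indent the result cleanly.

```php
<?php
// The stylesheet from the answer above, unchanged.
$xslSource = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*[not(node())]"/>
</xsl:stylesheet>
XSL;

// The sample data from the question.
$xmlSource = <<<'XML'
<class>
  <student rollno="393">
    <firstname>Dinkar</firstname>
    <lastname>Kad</lastname>
    <nickname></nickname>
    <marks>85</marks>
    <height></height>
  </student>
</class>
XML;

$stylesheet = new DOMDocument();
$stylesheet->loadXML($xslSource);

$source = new DOMDocument();
$source->preserveWhiteSpace = false; // drop whitespace-only text between elements
$source->loadXML($xmlSource);

// Requires ext-xsl.
$processor = new XSLTProcessor();
$processor->importStylesheet($stylesheet);

// transformToXml() returns the result as a string.
$result = $processor->transformToXml($source);
echo $result;
```

Unlike the DOM loop, this leaves the source document untouched and produces the cleaned XML as a new string.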