How to remove only empty nodes in XML using Xpath expressions?

Time:12-11

I need to remove empty nodes from an XML document using XPath expressions.

Consider the sample XML below. The 'nickname' and 'height' nodes are not needed because they are empty.

Original Data


<class>
   <student rollno = "393">
      <firstname>Dinkar</firstname>
      <lastname>Kad</lastname>
      <nickname></nickname>
      <marks>85</marks>
      <height></height>
   </student>
</class>

Expected Data


<class>
   <student rollno = "393">
      <firstname>Dinkar</firstname>
      <lastname>Kad</lastname>
      <marks>85</marks>
   </student>
</class>

CodePudding user response:

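You can use XPath with the predicate not(node()) to select all elements that have no child nodes at all (no child elements, text, comments, or processing instructions), and then remove each match through the DOM.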

For example:

<?php
$doc = new DOMDocument;
// Ignore the whitespace-only text nodes between elements so that
// formatOutput can re-indent the result
$doc->preserveWhiteSpace = false;
$doc->loadXML('<parentnode>
   <tag1>2</tag1>
   <tag2>4</tag2>
   <tag3></tag3>
   <tag2>4</tag2>
   <tag3></tag3>
   <tag2>4</tag2>
   <tag3></tag3>
</parentnode>');

$xpath = new DOMXPath($doc);

// Select every element that has no child nodes and detach it from its parent
foreach ($xpath->query('//*[not(node())]') as $node) {
    $node->parentNode->removeChild($node);
}

$doc->formatOutput = true;
echo $doc->saveXML();

prints

<?xml version="1.0"?>
<parentnode>
  <tag1>2</tag1>
  <tag2>4</tag2>
  <tag2>4</tag2>
  <tag2>4</tag2>
</parentnode>
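
Applied to the XML from the question, the same approach might look like the sketch below. The helper name removeEmptyElements is hypothetical, and the stricter predicate not(*) and not(normalize-space()) is my own assumption rather than part of the answer above: it also treats elements that contain only whitespace as empty, which not(node()) does not.

```php
<?php
// Hypothetical helper (not from the original answer): removes every element
// that has no child elements and no non-whitespace text.
function removeEmptyElements(string $xml): string
{
    $doc = new DOMDocument();
    $doc->preserveWhiteSpace = false;
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);

    // Assumed predicate: no child elements and an empty (or whitespace-only)
    // string value.
    $empty = $xpath->query('//*[not(*) and not(normalize-space())]');

    foreach ($empty as $node) {
        $node->parentNode->removeChild($node);
    }

    $doc->formatOutput = true;
    return $doc->saveXML();
}

// Sample document from the question
$xml = <<<'XML'
<class>
   <student rollno="393">
      <firstname>Dinkar</firstname>
      <lastname>Kad</lastname>
      <nickname></nickname>
      <marks>85</marks>
      <height></height>
   </student>
</class>
XML;

echo removeEmptyElements($xml);
```

Note that this predicate would also remove an element that carries only attributes, such as <student rollno="1"/>. If those should be kept, adding "and not(@*)" to the predicate is one option.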

CodePudding user response:

The XPath expression to target those empty elements is: *[not(node())].

However, XPath only selects nodes; it cannot modify or transform the XML by itself.

I would apply the following XSLT stylesheet. It has an identity template that copies all content unchanged, plus a more specific template that matches elements without any child nodes (no child elements and no text()). Because that second template is empty, the matching elements are dropped from the output.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    
    <xsl:template match="node()|@*">
      <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>            
    </xsl:template>
    
    <xsl:template match="*[not(node())]"/>
    
</xsl:stylesheet>
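
To run this stylesheet from PHP, something like the following sketch should work. It assumes the XSL extension (ext-xsl) is enabled, and the variable names are only illustrative. The source document is loaded with preserveWhiteSpace set to false so that the whitespace around the dropped elements is not copied into the result.

```php
<?php
// Assumption: PHP's XSL extension (ext-xsl) is installed and enabled.
$stylesheet = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>

    <xsl:template match="node()|@*">
      <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:template>

    <xsl:template match="*[not(node())]"/>

</xsl:stylesheet>
XSL;

$xsl = new DOMDocument();
$xsl->loadXML($stylesheet);

// Sample document from the question
$xml = new DOMDocument();
$xml->preserveWhiteSpace = false;
$xml->loadXML('<class>
   <student rollno="393">
      <firstname>Dinkar</firstname>
      <lastname>Kad</lastname>
      <nickname></nickname>
      <marks>85</marks>
      <height></height>
   </student>
</class>');

$processor = new XSLTProcessor();
$processor->importStylesheet($xsl);

// transformToXml() returns the transformed document as a string
$result = $processor->transformToXml($xml);
echo $result;
```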