I'm trying to use PHP to parse a namespaced XML document and output HTML that retains the document's original structure.
I have used XPath and foreach loops in the code below to render headings, paragraphs and lists, but this doesn't respect the original structure of the document. I'm also unclear on how to render something like a URL that is embedded in the content and wrapped in its own XML tag.
XML example:
<a:section>
  <c:ref value="1">1</c:ref>
  <c:title>Title of content</c:title>
  <f:subsection>
    <c:ref value="1.1">1.1</c:ref>
    <c:title>Subsection title</c:title>
    <b:content>Make sure you check out this link: <c:url address="www.google.com" type="https">google.com</c:url> and then review the list below:</b:content>
    <c:list type="bullet">
      <c:listitem>
        <b:content>bullet item 1</b:content>
      </c:listitem>
      <c:listitem>
        <b:content>bullet item 2</b:content>
      </c:listitem>
      <c:listitem>
        <b:content>bullet item 3</b:content>
      </c:listitem>
    </c:list>
    <b:content>More content here in text form</b:content>
  </f:subsection>
</a:section>
PHP example:
$xml = file_get_contents('content.xml');
$sxml = new SimpleXMLElement($xml);
$section = $sxml->xpath('//a:section');
foreach ($section as $s) {
    $sectionnumber = $s->xpath('c:ref');
    $title = $s->xpath('c:title');
    foreach ($title as $t) {
        echo '<h2>'.$sectionnumber[0].' '.$t.'</h2>';
    }
    // subsections are read inside the section loop, where $s is defined
    $subsection = $s->xpath('f:subsection');
    foreach ($subsection as $ss) {
        $subheadingnumber = $ss->xpath('c:ref');
        $subheading = $ss->xpath('c:title');
        foreach ($subheading as $sh) {
            echo '<h3>'.$subheadingnumber[0].' '.$sh.'</h3>';
        }
        // all paragraphs are printed first, then all lists (document order is lost)
        $paragraphs = $ss->xpath('b:content');
        foreach ($paragraphs as $p) {
            // string conversion keeps only the direct text, so c:url is dropped
            echo '<p>'.$p.'</p>';
        }
        $lists = $ss->xpath('c:list');
        foreach ($lists as $l) {
            $listitem = $l->xpath('c:listitem');
            foreach ($listitem as $item) {
                $listcontent = $item->xpath('b:content');
                foreach ($listcontent as $a) {
                    echo '<li>'.$a.'</li>';
                }
            }
        }
    }
}
Answer:
Your sample is missing the document element that carries the namespace definitions. These definitions matter: a prefix is only an alias for a namespace URI, so you should not rely on it (prefixes can change, and they are optional for elements).
For this answer I added a document element with dummy namespaces:
<?xml version="1.0" encoding="utf-8" ?>
<a:content
  xmlns:a="urn:a"
  xmlns:b="urn:b"
  xmlns:c="urn:c"
  xmlns:f="urn:f">
  <a:section>
...
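The same point applies if you stay with XPath in PHP. The following is a minimal sketch, assuming the dummy urn:a and urn:c namespaces from above and a made-up inline sample whose prefixes (x and y) deliberately differ from the ones used in the expression. DOMXPath::registerNamespace() binds the prefixes in the expression to the namespace URIs, so the prefixes used in the document itself do not matter:

```php
<?php
// Hypothetical sample: same namespace URIs as above, but different prefixes (x, y).
$xml = <<<'XML'
<x:content xmlns:x="urn:a" xmlns:y="urn:c">
  <x:section>
    <y:title>Title of content</y:title>
  </x:section>
</x:content>
XML;

$document = new DOMDocument();
$document->loadXML($xml);

$xpath = new DOMXPath($document);
// Bind our own prefixes to the namespace URIs (assumed dummy URIs).
$xpath->registerNamespace('a', 'urn:a');
$xpath->registerNamespace('c', 'urn:c');

// The expression uses a: and c:, although the document uses x: and y:.
$titles = [];
foreach ($xpath->query('//a:section/c:title') as $node) {
    $titles[] = $node->textContent;
}

echo implode("\n", $titles), "\n";
```

The expression matches because XPath compares namespace URIs, not the prefix strings.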
XSLT is a templating language designed for exactly this purpose. It lets you define templates that match nodes and transform them:
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:a="urn:a"
  xmlns:b="urn:b"
  xmlns:c="urn:c"
  xmlns:f="urn:f"
  exclude-result-prefixes="a b c f">
  <xsl:output method="html"/>
  <xsl:template match="/*">
    <div>
      <xsl:for-each select="a:section">
        <h2><xsl:value-of select="c:title"/></h2>
        <xsl:for-each select="f:subsection">
          <h3><xsl:value-of select="c:title"/></h3>
          <div><xsl:apply-templates select="b:content|c:list"/></div>
        </xsl:for-each>
      </xsl:for-each>
    </div>
  </xsl:template>
  <xsl:template match="b:content">
    <p><xsl:apply-templates/></p>
  </xsl:template>
  <xsl:template match="c:list">
    <ul>
      <xsl:for-each select="c:listitem">
        <li><xsl:apply-templates/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
  <xsl:template match="c:url">
    <a href="{@type}://{@address}"><xsl:apply-templates/></a>
  </xsl:template>
</xsl:stylesheet>
Take care that the namespace URIs in the stylesheet match the ones in your XML document.
PHP will load the XML and the template and process it:
// load the content
$content = new DOMDocument();
$content->load(__DIR__.'/content.xml');
// load the template
$template = new DOMDocument();
$template->load(__DIR__.'/transform.xsl');
// bootstrap XSLT
$processor = new XSLTProcessor();
$processor->importStylesheet($template);
// transform and output
echo $processor->transformToXml($content);
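To see how the templates handle a URL embedded in running text, here is a self-contained sketch that uses inline strings instead of files. It assumes the xsl extension is enabled and reuses the dummy urn:b and urn:c namespaces; the sample sentence and the explicit failure check are illustrative additions, not part of the code above:

```php
<?php
// Hypothetical mixed content: text with an embedded c:url element.
$xml = <<<'XML'
<b:content xmlns:b="urn:b" xmlns:c="urn:c">See <c:url address="www.google.com" type="https">google.com</c:url> for details.</b:content>
XML;

// Reduced stylesheet: only the two templates needed for this sample.
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:b="urn:b"
  xmlns:c="urn:c"
  exclude-result-prefixes="b c">
  <xsl:output method="html"/>
  <xsl:template match="b:content">
    <p><xsl:apply-templates/></p>
  </xsl:template>
  <xsl:template match="c:url">
    <a href="{@type}://{@address}"><xsl:apply-templates/></a>
  </xsl:template>
</xsl:stylesheet>
XSL;

$content = new DOMDocument();
$content->loadXML($xml);

$template = new DOMDocument();
$template->loadXML($xsl);

$processor = new XSLTProcessor();
$processor->importStylesheet($template);

// transformToXml() does not return a string when the transformation fails
$html = $processor->transformToXml($content);
if (!is_string($html)) {
    throw new RuntimeException('XSLT transformation failed');
}

echo $html;
```

Because the b:content template calls xsl:apply-templates, text nodes are copied in document order and the c:url child is handed to its own template, so the link stays in place inside the paragraph.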
Output:
<div>
  <h2>Title of content</h2>
  <h3>Subsection title</h3>
  <div>
    <p>Make sure you check out this link: <a href="https://www.google.com">google.com</a> and then review the list below:</p>
    <ul>
      <li><p>bullet item 1</p></li>
      <li><p>bullet item 2</p></li>
      <li><p>bullet item 3</p></li>
    </ul>
    <p>More content here in text form</p>
  </div>
</div>