I'm trying to create an XML document whose root element looks like this:
<spirit:component xmlns:spirit="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4"
xmlns:vendorExtensions="$IREG_GEN/XMLSchema/SPIRIT"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="$IREG_GEN/XMLSchema/SPIRIT/VendorExtensions.xsd
http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4
http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4/index.xsd">
So I wrote a Perl script for this:
use strict;
use warnings;
use Spreadsheet::ParseXLSX;
use XML::LibXML;
my $doc = XML::LibXML::Document->new('1.0', 'utf-8');
my $root = $doc->createElement('spirit:component');
#$root->appendChild($doc->createComment("JJ"));
$root->setAttribute('xmlns:spirit'=> "http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4");
$root->setAttribute('xmlns:vendorExtensions'=> "\$IREG_GEN/XMLSchema/SPIRIT");
$root->setAttribute('xmlns:xsi'=> "http://www.w3.org/2001/XMLSchema-instance");
$root->setAttribute('xsi:schemaLocation'=> "http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4
http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4
http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4/index.xsd");
$doc->setDocumentElement($root);
print $doc->toString(1);
But the problem is that I get this result:
<spirit:component xmlns:spirit="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4" xmlns:vendorExtensions="$IREG_GEN/XMLSchema/SPIRIT" xmlns:xsi
="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4 											http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4 											http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4/index.xsd"/>
In particular, there are two problems here: the &#9; sequences and the trailing / in index.xsd"/>
I can remove the newlines from the string, which resolves the first problem:
$root->setAttribute('xsi:schemaLocation'=> "http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4 http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4 http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4/index.xsd");
But how can I remove the / in index.xsd"/>? Did I use the wrong function?
CodePudding user response:
In XML, an element without any children or other enclosed content can be, and typically is, written in the single empty-element form <foo/> instead of <foo></foo>. It needs to be one or the other, though; unlike HTML, in XML every opening tag needs a closing one. So there's nothing wrong with that part of the output.
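To see the two forms side by side, here is a minimal sketch using XML::LibXML. The element names foo and bar and the text baz are placeholders made up for illustration; they are not part of the question's schema.

```perl
use strict;
use warnings;
use XML::LibXML;

# An element with no children is serialized in the empty-element form.
my $doc  = XML::LibXML::Document->new('1.0', 'utf-8');
my $root = $doc->createElement('foo');
$doc->setDocumentElement($root);
my $empty_form = $root->toString();
print "$empty_form\n";

# Parsing the long form gives the same empty element back.
my $reparsed = XML::LibXML->load_xml(string => '<foo></foo>')
                          ->documentElement->toString();
print "$reparsed\n";

# Once the element has content, separate open and close tags are written.
$root->appendTextChild('bar', 'baz');
my $filled_form = $root->toString();
print "$filled_form\n";
```

So the trailing / should disappear on its own as soon as child elements are appended to spirit:component.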
For the text of the xsi:schemaLocation attribute (which needs an even number of items, since it is a list of namespace and schema URL pairs): &#9; is a tab. Replace the tabs with spaces; those won't get encoded. The newlines will still be encoded as &#10;, though. According to this answer to a SO question on whether newlines are valid in attribute text, an XML parser converts character references back into the characters they stand for and normalizes literal whitespace in an attribute value to spaces when a program requests the content. So while it looks ugly, in practice what you have shouldn't cause issues with conforming XML parsers.
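One way to avoid the tabs and newlines altogether is to build the attribute value from a list of pairs and join the items with single spaces. The following is only a sketch: the @pairs array and the $schema_location variable are names invented here, and the single pair shown is taken from the question, so add further namespace/schema pairs as needed.

```perl
use strict;
use warnings;
use XML::LibXML;

# Each entry is [ namespace URI, schema URL ]; only one pair is shown here.
my @pairs = (
    [ 'http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4',
      'http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4/index.xsd' ],
);

# Flatten the pairs and separate every item with one plain space,
# so the serializer has no tab or newline to encode.
my $schema_location = join ' ', map { @$_ } @pairs;

my $doc  = XML::LibXML::Document->new('1.0', 'utf-8');
my $root = $doc->createElement('spirit:component');
$root->setAttribute('xmlns:spirit' => 'http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4');
$root->setAttribute('xmlns:xsi'    => 'http://www.w3.org/2001/XMLSchema-instance');
$root->setAttribute('xsi:schemaLocation' => $schema_location);
$doc->setDocumentElement($root);

my $xml = $doc->toString(1);
print $xml;
```

Keeping the pairs in a nested array also makes it harder to end up with an odd number of items by accident.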
Testing by piping the output of your script to this one:
#!/usr/bin/env perl
use warnings;
use strict;
use feature qw/say/;
use XML::LibXML;
my $dom = XML::LibXML->load_xml({ IO => \*STDIN });
my $root = $dom->documentElement();
for my $attr ($root->attributes()) {
say $attr->name(), " is ", $attr->getValue();
}
prints out
schemaLocation is http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4
http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4
http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4/index.xsd
xmlns:spirit is http://www.spiritconsortium.org/XMLSchema/SPIRIT/1.4
xmlns:vendorExtensions is $IREG_GEN/XMLSchema/SPIRIT
xmlns:xsi is http://www.w3.org/2001/XMLSchema-instance
so that seems to be true with libxml2, at least.
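The difference between a literal newline and the &#10; character reference can also be checked directly. This is a small sketch with a made-up element a and attribute v; as far as I understand the attribute-value normalization rules, the literal newline should come back as a space, while the character reference should come back as a real newline.

```perl
use strict;
use warnings;
use XML::LibXML;

# Attribute value containing a literal newline in the XML source.
my $literal = XML::LibXML->load_xml(string => "<a v=\"x\ny\"/>")
                         ->documentElement->getAttribute('v');

# Attribute value containing the character reference &#10; instead.
my $charref = XML::LibXML->load_xml(string => '<a v="x&#10;y"/>')
                         ->documentElement->getAttribute('v');

# Make the whitespace visible before printing.
for my $value ($literal, $charref) {
    (my $shown = $value) =~ s/\n/\\n/g;
    print "[$shown]\n";
}
```

This is why the serializer writes tabs and newlines in attribute values as character references: it is the only way for them to survive a round trip through a parser. For xsi:schemaLocation it makes no practical difference, because the value is a whitespace-separated list.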