When I run:
#!/usr/bin/env ruby
require 'nokogiri'
xml = <<-EOXML
<pajamas>
<bananas>
<foo>bar</foo>
<bar>bar</bar>
<1>bar</1>
</bananas>
</pajamas>
EOXML
doc = Nokogiri::XML(xml)
puts doc.at('/pajamas/bananas/foo')
puts doc.at('/pajamas/bananas/bar')
puts doc.at('/pajamas/bananas/1')
I get an ERROR: Invalid expression: /pajamas/bananas/1 (Nokogiri::XML::XPath::SyntaxError)
Is this a case of Nokogiri not liking integers as node names, and is there a workaround?
Looking at the documentation, I did not see a workaround to this. Removing the last line eliminates the error and prints the first two nodes as expected.
CodePudding user response:
An XML element with a name that starts with a number is invalid XML.
XML elements must follow these naming rules:
- Names can contain letters, numbers, and other characters
- Names cannot start with a number or punctuation character (an underscore is allowed)
- Names should not start with the letters xml (or XML, or Xml, etc), which are reserved
- Names cannot contain spaces
Apart from that, any name can be used; no words are reserved.
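As a rough illustration of the naming rules, the check below tests a few candidate element names. It is a simplified, ASCII-only approximation: the real XML specification also allows many non-ASCII letters. The constant `XML_NAME` and the method `valid_xml_name?` are names made up for this sketch, not part of Nokogiri or any other library.

```ruby
# Simplified, ASCII-only approximation of an XML element name:
# first character is a letter, underscore, or colon; the rest may
# also include digits, hyphens, and periods. (Hypothetical helper.)
XML_NAME = /\A[A-Za-z_:][A-Za-z0-9_:.\-]*\z/

def valid_xml_name?(name)
  XML_NAME.match?(name)
end

# "1" fails because it starts with a digit; "_1" and "n1" pass.
%w[foo bar 1 _1 n1].each do |name|
  puts "#{name}: #{valid_xml_name?(name)}"
end
```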
You're trying to parse invalid XML with an XML parser; it's just not going to work. If you're really getting <1> as a tag and can't control that somehow, I'd suggest replacing the tags using a regex before getting to Nokogiri.
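A minimal sketch of that regex approach might look like the following. The helper name `rename_numeric_tags` and the `n` prefix are arbitrary choices for illustration. The pattern is naive: it would also rewrite a `<` followed by a digit inside comments or CDATA sections, so treat it as a starting point rather than a general fix.

```ruby
# Hypothetical helper: prefixes element names that start with a digit
# so the markup becomes well-formed. The default prefix "n" is arbitrary.
def rename_numeric_tags(xml, prefix = 'n')
  # Matches "<" plus an optional "/" followed by a name that begins
  # with a digit, e.g. "<1" or "</1", and inserts the prefix.
  xml.gsub(%r{<(/?)(\d[\w.-]*)}) { "<#{$1}#{prefix}#{$2}" }
end

xml = <<-EOXML
<pajamas>
<bananas>
<foo>bar</foo>
<bar>bar</bar>
<1>bar</1>
</bananas>
</pajamas>
EOXML

fixed = rename_numeric_tags(xml)
puts fixed
```

The cleaned string can then be passed to Nokogiri::XML as in the question, with the renamed element queried under its new name (for example /pajamas/bananas/n1).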