Get nodes from XML file where name has "."

Time:06-08

I have an XML file of the following type:

<timeseries><mrid>1</mrid><businesstype>A96</businesstype><flowdirection.direction>A02</flowdirection.direction><quantity_measure_unit.name>MAW</quantity_measure_unit.name> 

I use the following code, and it returns {xml_nodeset (0)}. When I search for "businesstype" instead, I get results from the node. I think the problem is the "." inside the node name. How can I solve this using the same functions as in my code?

library(rvest)
xml %>% html_elements("flowdirection.direction") %>% html_text()
xml %>% html_nodes("flowdirection.direction")

CodePudding user response:

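The default selector type for html_elements() (and html_nodes()) is a CSS selector. In CSS selectors, the . has a special meaning: it indicates the HTML "class" of a node, so "flowdirection.direction" looks for a <flowdirection> element with the class "direction" rather than an element named flowdirection.direction. If you want to use CSS selectors, you have to escape the . with a backslash (doubled inside an R string). For example: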

library(rvest)

xml <- xml2::read_xml("<timeseries><mrid>1</mrid><businesstype>A96</businesstype><flowdirection.direction>A02</flowdirection.direction></timeseries>")

xml %>%  
  html_elements("flowdirection\\.direction")

Or you could use an XPath selector, where the . is an ordinary name character:

xml %>%  
  html_elements(xpath="flowdirection.direction")

This might make more sense because you seem to be working with raw XML data rather than HTML nodes.
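
If you do treat the file as plain XML, a minimal sketch with xml2 alone could look like the following. It assumes the sample from the question with a closing </timeseries> tag added so that it parses; the variable names doc, direction, and unit are only illustrative, and ".//" is used so the search also works if the nodes sit deeper in a real file.

```r
library(xml2)

# Sample based on the question; the closing </timeseries> tag is an assumption
doc <- read_xml("<timeseries><mrid>1</mrid><businesstype>A96</businesstype><flowdirection.direction>A02</flowdirection.direction><quantity_measure_unit.name>MAW</quantity_measure_unit.name></timeseries>")

# In XPath, "." inside an element name needs no escaping
direction <- xml_text(xml_find_all(doc, ".//flowdirection.direction"))
unit <- xml_text(xml_find_all(doc, ".//quantity_measure_unit.name"))

print(direction)
print(unit)
```

xml_find_all() returns a node set like html_elements() does, so the rest of a pipeline should not need to change much.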

Tags: r, xml