I have an XML file with USER_DEFINED parameters that I'm trying to parse out. Here is an example of the XML document.
<userDefinedParameters>
<USER_DEFINED parameter="P1">LEFT</USER_DEFINED>
<USER_DEFINED parameter="P2">RIGHT</USER_DEFINED>
<USER_DEFINED parameter="P3">1234</USER_DEFINED>
<USER_DEFINED parameter="P4">5678</USER_DEFINED>
</userDefinedParameters>
</data>
</segment>
</body>
</head>
I am able to parse out all other data from this file using the XML package and xpathApply. However, I can't pull out the USER_DEFINED parameter values this way.
Since there are several records in the XML, I'd like to get all P1s, P2s, etc., the same way I get the other fields with xpathApply. The document states that all USER_DEFINED parameters are stored as 'parameter' and 'value', so I think I need to pull them as c('parameter', 'value'), but I don't know how to do this using XML.
I have looked at this SO page; it helped a lot but doesn't answer this question.
Thanks for any/all help.
UPDATED with the desired output and how I'm trying to get the data. Note that the code below doesn't work as desired.
My current xpathApply usage gets all USER_DEFINED rows within the userDefinedParameters section. If I change it to xpathApply(data, "//USER_DEFINED", xmlValue), then I get all values but with no relation to the parameter name. I need something like xpathApply(data, "//USER_DEFINED/P1", xmlValue), but, obviously, this doesn't work.
library(XML)
fileName <- "./file.xml"
data <- xmlParse(fileName)
xml_data <- xmlToList(data)
p1 <- xpathApply(data, "//USER_DEFINED")
p2 <- xpathApply(data, "//USER_DEFINED")
# View(p1)
# "P1"
# LEFT
# LEFT
# RIGHT
# View(p2)
# "P2"
# RIGHT
# RIGHT
# LEFT
# ...
CodePudding user response:
Using the xml2 library, you could get the value of the parameter attribute from each node using xml_attr().
Something like this:
library(xml2)
x <- read_xml('<userDefinedParameters>
<USER_DEFINED parameter="P1">LEFT</USER_DEFINED>
<USER_DEFINED parameter="P2">right</USER_DEFINED>
<USER_DEFINED parameter="P3">1234</USER_DEFINED>
<USER_DEFINED parameter="P4">5678</USER_DEFINED>
</userDefinedParameters>')
# Select the nodes once, then read the text and the attribute from the same node set
nodes <- xml_find_all(x, "//USER_DEFINED")
dataset <- data.frame(user_defined = xml_text(nodes),
                      parameter = xml_attr(nodes, "parameter"))
Result in dataset:
user_defined parameter
1 LEFT P1
2 right P2
3 1234 P3
4 5678 P4
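If the goal is to look a value up by its parameter name, one option is to turn the two columns into a named vector. This is only a sketch: the variable name params is made up for illustration, and it assumes each parameter name appears once in the selected nodes.

```r
library(xml2)

x <- read_xml('<userDefinedParameters>
<USER_DEFINED parameter="P1">LEFT</USER_DEFINED>
<USER_DEFINED parameter="P2">right</USER_DEFINED>
<USER_DEFINED parameter="P3">1234</USER_DEFINED>
<USER_DEFINED parameter="P4">5678</USER_DEFINED>
</userDefinedParameters>')

nodes <- xml_find_all(x, "//USER_DEFINED")

# Name each text value by its "parameter" attribute
# (params is a hypothetical name, not part of xml2)
params <- setNames(xml_text(nodes), xml_attr(nodes, "parameter"))

# Look up a single value by parameter name
params[["P3"]]
```

With several records in one file, the names would repeat, so this lookup is best applied per record rather than to the whole document.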
CodePudding user response:
If you'd like to stick with the XML package, you can use the xmlAttrs function inside sapply:
text <-' <head> <body> <segment>
<data>
<userDefinedParameters>
<USER_DEFINED parameter="P1">LEFT</USER_DEFINED>
<USER_DEFINED parameter="P2">right</USER_DEFINED>
<USER_DEFINED parameter="P3">1234</USER_DEFINED>
<USER_DEFINED parameter="P4">5678</USER_DEFINED>
</userDefinedParameters>
</data>
</segment>
</body>
</head>'
library(XML)
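doc <- xmlRoot(xmlParse(text, asText = TRUE))
nodes <- xpathApply(doc, ".//userDefinedParameters/USER_DEFINED")
# xmlAttrs() returns a named vector of a node's attributes; pick out "parameter"
attributes <- sapply(nodes, function(n) xmlAttrs(n)[["parameter"]])
# Apply xmlValue to each node to get its text content
values <- sapply(nodes, xmlValue)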
data.frame(attributes, values)
# attributes values
# 1 P1 LEFT
# 2 P2 right
# 3 P3 1234
# 4 P4 5678
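To get all P1s, all P2s, and so on across several records, which is what the question's "//USER_DEFINED/P1" attempt was aiming at, an XPath attribute predicate may be enough. The sketch below assumes a two-record document; the second record and its values are invented for illustration, and only the element and attribute names come from the question.

```r
library(XML)

# Hypothetical two-record document (the second record is made up)
text <- '<head>
<segment><userDefinedParameters>
<USER_DEFINED parameter="P1">LEFT</USER_DEFINED>
<USER_DEFINED parameter="P2">RIGHT</USER_DEFINED>
</userDefinedParameters></segment>
<segment><userDefinedParameters>
<USER_DEFINED parameter="P1">RIGHT</USER_DEFINED>
<USER_DEFINED parameter="P2">LEFT</USER_DEFINED>
</userDefinedParameters></segment>
</head>'
doc <- xmlParse(text, asText = TRUE)

# [@parameter='P1'] keeps only the nodes whose "parameter" attribute equals "P1";
# xpathSApply then applies xmlValue to each match and simplifies the result
p1 <- xpathSApply(doc, "//USER_DEFINED[@parameter='P1']", xmlValue)
p2 <- xpathSApply(doc, "//USER_DEFINED[@parameter='P2']", xmlValue)
```

Here p1 and p2 each hold one value per record, in document order, so they line up with other per-record fields extracted the same way as long as every record contains that parameter.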