I have an XML response containing Body and Header nodes. How can I access the value of the $Envelope$Body$checkVatResponse$valid node?
For some reason I can't even find the Body node using xml_find_all:
library(httr)
library(dplyr)
library(rvest)
library(xml2)
body = r'[<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" >
<soapenv:Header/>
<soapenv:Body>
<urn:checkVat xmlns:urn="urn:ec.europa.eu:taxud:vies:services:checkVat:types">
<urn:countryCode>NL</urn:countryCode>
<urn:vatNumber>800938495B01</urn:vatNumber>
</urn:checkVat>
</soapenv:Body>
</soapenv:Envelope>]'
r <- POST("http://ec.europa.eu/taxation_customs/vies/services/checkVatTestService", body = body)
stop_for_status(r)
content(r) %>% xml_find_all('//Body')
content(r) %>% xml2::as_list()
res <- content(r)
xml_children(res) %>% xml_name()
# [1] "Header" "Body"
xml_find_all(res,'.//Body')
# {xml_nodeset (0)}
CodePudding user response:
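When working with XML data, you need to be mindful of the namespaces used in the file. In an XPath query, every namespaced node must be prefixed with the correct namespace prefix; an unprefixed name such as //Body only matches nodes that are in no namespace, which is why your query returns an empty node set. To extract the valid value you can use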
content(r) %>% xml_find_all('//env:Body/ns2:checkVatResponse/ns2:valid')
To see all the namespaces used by the file you can run
content(r) %>% xml_ns()
# env <-> http://schemas.xmlsoap.org/soap/envelope/
# ns2 <-> urn:ec.europa.eu:taxud:vies:services:checkVat:types
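To try the namespace-aware query without calling the service, here is a minimal sketch that parses a hand-written stand-in for the response. The sample XML is an assumption about the response shape (it only reuses the env and ns2 prefixes and URIs listed above), and the variable names are illustrative. It also shows a prefix-independent alternative using the XPath local-name() function, which may be handy if the server ever changes its prefixes.

```r
library(xml2)

# Hand-written stand-in for the service response (assumed shape, not real output).
sample_xml <- '<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Header/>
  <env:Body>
    <ns2:checkVatResponse xmlns:ns2="urn:ec.europa.eu:taxud:vies:services:checkVat:types">
      <ns2:countryCode>NL</ns2:countryCode>
      <ns2:vatNumber>800938495B01</ns2:vatNumber>
      <ns2:valid>true</ns2:valid>
    </ns2:checkVatResponse>
  </env:Body>
</env:Envelope>'

doc <- read_xml(sample_xml)

# Namespace-aware lookup: the prefixes are resolved through xml_ns(doc),
# which xml_find_first() uses by default.
valid_node <- xml_find_first(doc, "//env:Body/ns2:checkVatResponse/ns2:valid")

# xml_text() returns the node content as a character string.
valid_text <- xml_text(valid_node)

# Convert the string to a logical for further use.
is_valid <- identical(valid_text, "true")

# Alternative: match on the local name only, ignoring prefix and namespace.
valid_local <- xml_text(xml_find_first(doc, "//*[local-name() = 'valid']"))
```

With the real response, replacing doc with content(r) should work the same way, provided the response uses the namespaces reported by xml_ns().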