Are there alternative (other than XML library) ways to parse XML file in R?


I cannot use the XML package, but my existing code depends on it. Is it possible to rewrite the lines below the # XML package comment so that they give the same result without that package? Unfortunately, I cannot add a reproducible example using dput() on the XML object, because its output shows nothing that can be copied and pasted here.

I found a link showing how the xml2 package can be used as an alternative, but it does not cover all of the functions I need.

library(xml2)

# read url
url <- "https://transparency.entsoe.eu/api?securityToken=xxxx&documentType=A82&BusinessType=A96&controlArea_Domain=10YFI-1--------U&periodStart=202206020000&periodEnd=202206040000"
myXMLfile <- read_xml(url)

# find subset for timeseries
myXMLts_up <- xml_child(myXMLfile, search = 13, ns = xml_ns(myXMLfile))

# find subset for position/quantity data
myXMLpts_up <- xml_child(myXMLts_up, search = 7, ns = xml_ns(myXMLts_up))

# XML package
library(XML)

# my_data holds the XML response as a character string
myXML <- xmlTreeParse(my_data, asText = TRUE, useInternalNodes = TRUE)

myXML <- xmlRoot(myXML)

# convert to dataframe
myXMLdf <- xmlToDataFrame(myXML)

CodePudding user response:

Here is a solution with package rvest. The package title is

Wrappers around the 'xml2' and 'httr' packages to make it easy to download, then manipulate, HTML and XML.

The code is self-explanatory.


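library(rvest)

# the API also expects a securityToken parameter, which is omitted here
url <- "https://transparency.entsoe.eu/api?documentType=A82&BusinessType=A96&controlArea_Domain=10YFI-1--------U&periodStart=202206020000&periodEnd=202206040000"

myXMLfile <- read_html(url)

# text of every <position> element, converted to integer
position <- myXMLfile %>%
  html_elements("position") %>%
  html_text() %>%
  as.integer()

# text of every <quantity> element, converted to numeric
quantity <- myXMLfile %>%
  html_elements("quantity") %>%
  html_text() %>%
  as.numeric()

myXMLdf <- data.frame(position, quantity)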
#>   position quantity
#> 1        1        0
#> 2        2        0
#> 3        3        0
#> 4        4       80
#> 5        5       80
#> 6        6       80

Created on 2022-06-06 by the reprex package (v2.0.1)
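
If you would rather stay with xml2 alone, here is a minimal sketch of how the three XML calls from the question might map onto it. It assumes the reply is namespaced and has position and quantity elements grouped under a common parent. The sample string, the namespace URI and the Doc/TimeSeries/Period/Point names are made up for illustration and are not taken from a real API response. As far as I know, xml2 has no direct counterpart to xmlToDataFrame(), so the columns are built by hand.

```r
library(xml2)

# Hypothetical sample that only mimics the shape of a namespaced reply;
# the element names and values are illustrative, not real API output.
xml_txt <- '<Doc xmlns="urn:example:doc">
  <TimeSeries>
    <Period>
      <Point><position>1</position><quantity>0</quantity></Point>
      <Point><position>2</position><quantity>80</quantity></Point>
    </Period>
  </TimeSeries>
</Doc>'

# read_xml() accepts literal XML text, much like xmlTreeParse(asText = TRUE)
doc <- read_xml(xml_txt)

# Remove the default namespace in place so plain element names work in XPath
xml_ns_strip(doc)

# xml_root() plays the role of xmlRoot()
root <- xml_root(doc)

# One node per row of the future data frame
points <- xml_find_all(root, ".//Point")

# Build the columns by hand; xml_find_first() returns one result per Point
myXMLdf <- data.frame(
  position = xml_integer(xml_find_first(points, "./position")),
  quantity = xml_double(xml_find_first(points, "./quantity"))
)

print(myXMLdf)
```

Selecting elements by name this way avoids depending on positional indices such as search = 13, which can shift if the layout of the reply changes.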
