I am using XML files obtained from
https://eco2mix.rte-france.com/curves/getDonneesMarche?&dateDeb=31/12/2020&dateFin=24/02/2021&mode=NORM&_=1648578231712 (called WEEKS1)
and https://eco2mix.rte-france.com/curves/getDonneesMarche?&dateDeb=04/12/2021&dateFin=31/12/2021&mode=NORM&_=1648650611995 (called WEEKS7).
I downloaded the files and saved them in a local folder.
I want to extract a time series from these files, so I use the following code:
library(XML)
library(methods)
library(purrr)
list.filenames<-list.files(pattern = "\\.xml")
France2022<-lapply(list.filenames, function(file) #Reading files in my local repo
xmlParse(file)
)
France2022<-map(France2022, xmlRoot)
Next, I wanted to use lapply on my France2022 object to get my data:
lapply(6:61, function(root)
xmlToDataFrame(France2022[[2]][[root]][[7]])) # the second list is associated with WEEKS7
but the following error appears:
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘xmlToDataFrame’ for signature ‘"NULL", "missing", "missing", "missing", "missing"’
At this point I noticed that one of the files has a problem. I do not know what is happening, because both files have the same structure. I also tried reading the file directly from its https URL, but I get the same error:
F7<-read_xml("https://eco2mix.rte-france.com/curves/getDonneesMarche?&dateDeb=08/10/2021&dateFin=03/12/2021&mode=NORM&_=1648650611994")
F7<-xmlParse(F7)
lapply(6:61, function(root)
xmlToDataFrame(F7[[root]][[7]]))
CodePudding user response:
You could do the following:
library(tidyverse)
library(xml2)
dat <- read_xml("https://eco2mix.rte-france.com/curves/getDonneesMarche?&dateDeb=31/12/2020&dateFin=24/02/2021&mode=NORM&_=1648578231712")
dat %>%
xml_find_first("//donneesMarche") %>%
as_list() %>%
tibble::as_tibble(.name_repair = "unique") %>%
map_df(map_chr, simplify)
Resulting in
# A tibble: 11 × 24
valeur...1 valeur...2 valeur...3 valeur...4 valeur...5 valeur...6 valeur...7
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 43.74 38.01 36.75 33.06 31.67 34 40.44
2 43.36 39.78 37.27 34.09 33.29 35.86 42.83
3 ND ND ND ND ND ND ND
4 38.54 35 32.13 29.24 31.67 34 33.85
5 40.09 35.79 33.37 30.27 31.67 34 35.65
6 42.11 37.03 35.25 31.82 31.67 34 36.99
7 42.11 37.03 35.25 31.82 31.67 34 38.32
8 72.36 72.32 68.96 60.24 57.77 54.69 55.69
9 42.11 35.79 33.37 30.27 31.67 34 38.32
10 44.6 44.1 47 41.97 31.67 34 55
11 42.11 37.03 35.25 31.82 31.67 34 36.99
# … with 17 more variables: valeur...8 <chr>, valeur...9 <chr>,
# valeur...10 <chr>, valeur...11 <chr>, valeur...12 <chr>, valeur...13 <chr>,
# valeur...14 <chr>, valeur...15 <chr>, valeur...16 <chr>, valeur...17 <chr>,
# valeur...18 <chr>, valeur...19 <chr>, valeur...20 <chr>, valeur...21 <chr>,
# valeur...22 <chr>, valeur...23 <chr>, valeur...24 <chr>
Note that the third element has ND as its value.
That's why the columns are character instead of double.
I'll leave the tidying to you.
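As a possible starting point for that tidying, here is a minimal sketch that turns the ND marker into NA and converts the columns to double. The `prices_chr` tibble is a hypothetical stand-in built from a few of the values printed above, not the full result, and treating ND as a missing value is an assumption on my part.

```r
library(dplyr)

# Hypothetical stand-in for the character tibble above
# (first two columns, rows 1 to 3 only).
prices_chr <- tibble::tibble(
  valeur...1 = c("43.74", "43.36", "ND"),
  valeur...2 = c("38.01", "39.78", "ND")
)

# Replace the "ND" marker with NA, then convert every column to double.
prices_num <- prices_chr %>%
  mutate(across(everything(), ~ as.numeric(na_if(.x, "ND"))))
```

Doing the `na_if()` step first avoids the coercion warning that `as.numeric("ND")` would otherwise raise.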
CodePudding user response:
I found a solution by reading the files with read_xml and using unnest:
library(tidyverse)
library(xml2)
# Area codes used to label the rows
Names <- c("BE", "CH", "DE", "DL", "AT", "ES", "FR", "GB", "IT", "NL", "PT")
list.filenames <- list.files(pattern = "\\.xml")
# Read every XML file in the working directory
France2022 <- lapply(list.filenames, function(file)
  read_xml(file)
)
France_data <- map(France2022, ~ as_list(.) %>%
    tibble::as_tibble() %>%  # as.tibble() is deprecated
    unnest_longer("liste") %>%
    unnest_wider("liste") %>%
    unnest(cols = names(.)) %>%
    unnest(cols = names(.)) %>%
    select(-c(1)) %>%
    drop_na() %>%
    mutate_all(as.numeric) %>%
    mutate(Area = rep_len(Names, length.out = n()))
) %>%
  enframe() %>%  # convert list to tibble
  unnest(value)
I am not sure why the error message appeared; the files apparently contain no errors and are parsed as XML objects.
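One possible explanation, which I have not verified against the actual files: the signature "NULL" in the error means that xmlToDataFrame received NULL, so France2022[[2]][[root]][[7]] pointed at a child that does not exist for at least one root in 6:61. The WEEKS7 URL covers a shorter date range than WEEKS1, so it may simply have fewer children than the hardcoded 6:61 expects. Below is a minimal sketch of selecting nodes by name instead of by position, which works however many nodes a file has. The toy document and its element and attribute names (`day`, `date`) are made up for illustration and are not taken from the real eco2mix files.

```r
library(xml2)

# Toy document: element and attribute names are hypothetical.
doc <- read_xml('<liste>
  <meta>header</meta>
  <day date="2021-12-04"><valeur>43.74</valeur><valeur>38.01</valeur></day>
  <day date="2021-12-05"><valeur>43.36</valeur><valeur>39.78</valeur></day>
</liste>')

# Select every <day> node by name, however many the file contains.
days <- xml_find_all(doc, "//day")

# One row per <valeur>, tagged with the date of its parent <day>.
rows <- lapply(seq_along(days), function(i) {
  day <- days[[i]]
  data.frame(
    date  = xml_attr(day, "date"),
    value = as.numeric(xml_text(xml_find_all(day, "./valeur")))
  )
})
result <- do.call(rbind, rows)
```

With positional indexing, checking the number of children first (for example with `xml_length(doc)`) would show whether an index range such as 6:61 actually fits each file.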