Accessing the elements of a list in R with something similar to xpath

Time:02-02

I have the following list in R, called l (in the example I create it from XML, but my real example came from a different source):

library(xml2)
x <- read_xml("
  <foo> 
    <bar> <baz>apple</baz> <faz>cat</faz> </bar> 
    <bar> <baz>orange</baz> <faz>dog</faz> </bar> 
  </foo>")
l <- as_list(x)

l
$foo
$foo$bar
$foo$bar$baz
$foo$bar$baz[[1]]
[1] "apple"

$foo$bar$faz
$foo$bar$faz[[1]]
[1] "cat"


$foo$bar
$foo$bar$baz
$foo$bar$baz[[1]]
[1] "orange"

$foo$bar$faz
$foo$bar$faz[[1]]
[1] "dog"

I need to extract all the elements with a given name from this list, for example all the faz elements. If I had the data in XML format, I could do this quite simply with xml2 using:

x |> xml_find_all("//faz") |> xml_text()
[1] "cat" "dog"

But the solutions I've seen in R generally require complicated lapply() combinations that make my head hurt :-)

Is there any way to access the elements of a general R list in a similar manner? I'm not wedded to the xpath syntax per se, I'd be happy to have a solution that could somehow flatten an arbitrary list into names that looked like paths like /foo/bar/baz that I could then search through using grep() or similar, say:

names_as_paths(l) |> str_subset("/baz") |> extract_by_path(l)

CodePudding user response:

Check this recursive function, inspired by this answer:

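get_elements <- function(x, element) {
  # Non-list leaves fall through and return NULL, which unlist() drops
  if (is.list(x)) {
    if (element %in% names(x)) {
      # Name found at this level: [[ ]] returns the first match only
      x[[element]]
    } else {
      # Otherwise search every child and flatten the results
      unname(unlist(lapply(x, get_elements, element = element)))
    }
  }
}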

get_elements(l, "faz")
#[1] "cat" "dog"
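
One caveat with the function above: when the name being searched for occurs more than once at the same level (as bar does under foo), x[[element]] returns only the first match. Below is a minimal sketch of a variant that collects every match at every depth. The name get_all_elements is hypothetical, and the list l is rebuilt by hand here, assuming it has the same shape as the as_list() output in the question, so the snippet runs without xml2.

```r
# Hand-built stand-in for the list in the question (assumed same shape)
l <- list(foo = list(
  bar = list(baz = list("apple"),  faz = list("cat")),
  bar = list(baz = list("orange"), faz = list("dog"))
))

# Hypothetical helper: return a list of ALL elements named `element`,
# at any depth, including repeated names at the same level
get_all_elements <- function(x, element) {
  if (!is.list(x)) return(list())
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  # Matches at this level (single-bracket indexing keeps every match)
  here <- unname(x[nms == element])
  # Recurse into the children that did not match
  deeper <- lapply(x[nms != element], get_all_elements, element = element)
  # Flatten one level only, so each match stays intact
  c(here, unlist(deeper, recursive = FALSE, use.names = FALSE))
}

unlist(get_all_elements(l, "faz"))
length(get_all_elements(l, "bar"))
```

The result is a list of matches rather than a flattened vector, so each matched subtree keeps its structure; wrap the call in unlist() when only the leaf values are wanted.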

CodePudding user response:

If I understand correctly, simply applying unlist could suffice:

L <- unlist(l) 
L
#> foo.bar.baz foo.bar.faz foo.bar.baz foo.bar.faz 
#>    "apple"       "cat"    "orange"       "dog" 

L[grep("baz",names(L))]
#> foo.bar.baz foo.bar.baz
#>    "apple"    "orange" 
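
Building on the unlist() idea, here is a rough sketch of something closer to the names_as_paths() wished for in the question. The function name comes from the question's pseudocode and is not an existing function; the list l is rebuilt by hand, assuming the same shape as the as_list() output. It also assumes that element names contain no dots (unlist() joins names with ".", so a dot inside a name would be turned into an extra path separator) and that all leaves can be coerced to one common type.

```r
# Hand-built stand-in for the list in the question (assumed same shape)
l <- list(foo = list(
  bar = list(baz = list("apple"),  faz = list("cat")),
  bar = list(baz = list("orange"), faz = list("dog"))
))

# Hypothetical helper: flatten a nested list into a vector whose
# names are slash-separated paths such as "/foo/bar/baz"
names_as_paths <- function(x) {
  flat <- unlist(x)
  names(flat) <- paste0("/", gsub(".", "/", names(flat), fixed = TRUE))
  flat
}

flat <- names_as_paths(l)
names(flat)

# Anchor the pattern at the end of the path so that only the last
# component is matched, not a parent that happens to contain "baz"
hits <- flat[grepl("/baz$", names(flat))]
unname(hits)
```

Anchoring with "/baz$" is a little stricter than grep("baz", names(L)), which would also pick up names such as bazooka or any leaf underneath a parent called baz.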