This is a link to a page with a table of ~290 vine plant names:
https://www.forestryimages.org/browse/catsubject.cfm?cat=51
I am trying to read in the table and keep the Common Name
column. I have tried doing this with the rvest
library like so:
library(rvest)
vine_web <- "https://www.forestryimages.org/browse/catsubject.cfm?cat=51"
vine_names <- vine_web %>%
  read_html() %>%
  html_table()
It reads the column names, but not the contents of the table. I have tried several variations using html_nodes, html_element, copying the CSS selector, and even the XPath.
I always end up with this as a result:
[[1]]
# A tibble: 1 x 4
`Subject Number` `Common Name` `Scientific Name` `Number Of Images`
<lgl> <lgl> <lgl> <lgl>
1 NA NA NA NA
The table appears to be loaded dynamically, which leads me to believe that html_table() may need to be altered or may be the wrong function to use here. I would like to know if there is a way to read this table into R.
CodePudding user response:
It appears the table is filled in by JavaScript, so the rows are not in the HTML that read_html() downloads. There is a workaround: download the data as JSON. If you inspect the page in your browser and open the Network tab, you will find the JSON request the page makes. Let me know if this answers your question.
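library(jsonlite)
# URL copied from the browser's Network tab. The start= and length= values in
# the query string appear to control which rows are returned.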
json_data <- jsonlite::fromJSON("https://api.bugwood.org/rest/api/subject/.json?fmt=datatable&include=count&cat=51&systemid=2&draw=2&columns[0][data]=0&columns[0][searchable]=false&columns[0][orderable]=false&columns[0][search][value]=&columns[1][data]=1&columns[1][searchable]=true&columns[1][orderable]=true&columns[1][search][value]=&columns[2][data]=2&columns[2][searchable]=true&columns[2][orderable]=true&columns[2][search][value]=&columns[3][data]=3&columns[3][searchable]=false&columns[3][orderable]=true&columns[3][search][value]=&order[0][column]=1&order[0][dir]=asc&start=163&length=126&search[value]=&_=1657572710039")
as.data.frame(json_data$data)
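Since the question only needs the common names, here is a minimal sketch of pulling that one column out. It assumes json_data$data comes back as rows of four values in the same order as the headers shown in the question (Subject Number, Common Name, Scientific Name, Number Of Images). I have not confirmed that against the live API. The sample_json string and its two rows are made up purely so the snippet runs offline, and the column names are my own choice.

```r
library(jsonlite)

# Hypothetical stand-in for the API response. These rows are invented for
# illustration only and are NOT real records from the site.
sample_json <- '{"data": [["1", "example vine A", "Genus speciesA", "10"],
                          ["2", "example vine B", "Genus speciesB", "3"]]}'

parsed <- fromJSON(sample_json)

# fromJSON simplifies an array of equal-length arrays into a matrix,
# which as.data.frame turns into columns V1..V4
vines <- as.data.frame(parsed$data, stringsAsFactors = FALSE)

# Assumed column order, taken from the table headers in the question
names(vines) <- c("subject_number", "common_name",
                  "scientific_name", "number_of_images")

# Keep only the common names as a character vector
common_names <- vines$common_name
print(common_names)
```

With the real response you would replace parsed with json_data and check head(vines) first to confirm the columns line up before trusting the names.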