I'm trying to scrape the full list of projects and associated details from this site (the Project List is on the right-hand side):
https://www.forest-trends.org/project-list/
I cannot seem to identify the correct CSS selectors to get at the projects and their associated details. I wondered whether this has something to do with JavaScript in the HTML.
When I try the following:
library(rvest)
link <- "https://www.forest-trends.org/project-list"
urlData <- link %>% read_html %>% html_nodes(".project-tile")
I would expect to get a list of projects. Instead I get:
{xml_nodeset (0)}
How can I return the full list of projects and associated details?
CodePudding user response:
There is an API which you can use:
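library(jsonlite)
# Fetch the project data as JSON (here with an empty `ids` parameter)
df <- fromJSON('https://www.forest-trends.org/wp-content/themes/foresttrends/map_tools/project_fetch.php?ids=')
# The project records are in the `markers` element
head(df$markers)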
           lat          lng                       type
1 -11.78449871 -70.73347813 Forest and land-use carbon
2    17.067346    94.459977 Forest and land-use carbon
3     3.054216   -72.333984 Forest and land-use carbon
4     20.98685    -89.03344 Forest and land-use carbon
5    -0.886093      30.5798 Forest and land-use carbon
6    -1.809978    31.131299 Forest and land-use carbon
title location pid size
1 Reforestadores REDD Project Madre de Dios, Peru 1 85000
2 Reforestation and Restoration of degraded mangrove lands, sustainable livelihood and community development in Myanmar Myanmar 2 2575
3 San Nicolas Carbon Sequestration Project San Nicholas, Colombia 3 7300
4 Amigos de Calakmul Mexico Selva Maya, Mexico 4 56700
5 Uganda Nile Basin Reforestation Project No 4 Uganda 5 347
6 Emiti Nibwo Bulora Nyaishozi, Tanzania 6 130
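The uneven digits in lat and lng suggest those columns may arrive as character strings rather than numbers. If so, something like the following could convert them before mapping or summarising. This is only a sketch: tidy_markers is a hypothetical helper name, and the inline sample assumes the response has the shape shown in the output above.

```r
library(jsonlite)

# Hypothetical helper (not part of the original answer): coerce the
# coordinate and size columns of the markers table to numeric.
# Values that cannot be parsed become NA (with a warning).
tidy_markers <- function(markers) {
  num_cols <- intersect(c("lat", "lng", "size"), names(markers))
  markers[num_cols] <- lapply(
    markers[num_cols],
    function(x) as.numeric(as.character(x))
  )
  markers
}

# Small inline sample mimicking the assumed shape of the response,
# so the helper can be tried without a network call.
sample_json <- '{"markers":[
  {"lat":"-11.78449871","lng":"-70.73347813","type":"Forest and land-use carbon","pid":"1","size":"85000"},
  {"lat":"17.067346","lng":"94.459977","type":"Forest and land-use carbon","pid":"2","size":"2575"}
]}'
sample <- tidy_markers(fromJSON(sample_json)$markers)
str(sample)

# Against the live data (assumes the response still has a `markers` element):
# projects <- tidy_markers(df$markers)
```

Columns not named in the helper (type, title, location, pid) are left untouched, so the result can be used anywhere df$markers was.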