I'm trying to scrape randomly generated names from a website.
library(httr)
library(rvest)
url <- "https://letsmakeagame.net/tools/PlanetNameGenerator/"
mywebsite <- read_html(url) %>%
html_nodes(xpath="//div[contains(@id,'title')]")
However, that does not work. I'm assuming I have to «click» the «generate» button before extracting the content. Is there a simple way (without RSelenium) to achieve that?
Something similar to:
POST(url,
body = list("EntryPoint.generate()" = T),
encode = "form") -> res
res_t <- content(res, as="text")
Thanks!
CodePudding user response:
rvest isn't much help here: the planet names are not requested from a remote service but generated locally with JavaScript, which is what the EntryPoint.generate() call does. A relatively simple way is to use chromote, though closing its session and browser process seems somewhat messy at the moment:
library(chromote)
b <- ChromoteSession$new()
{
b$Page$navigate("https://letsmakeagame.net/tools/PlanetNameGenerator")
b$Page$loadEventFired()
}
# call EntryPoint.generate(), read the result from the <p id="title"></p> element,
# and replicate 10x
replicate(10, b$Runtime$evaluate('EntryPoint.generate();document.getElementById("title").innerText')$result$value)
#> [1] "Torade" "Ukiri" "Giconerth" "Dunia" "Brihoria"
#> [6] "Tiulaliv" "Giahiri" "Zuthewei 4A" "Elov" "Brachomia"
b$close()
#> [1] TRUE
b$parent$close()
#> Error in self$send_command(msg, callback = callback_, error = error_, : Chromote object is closed.
b$parent$get_browser()$close()
#> [1] TRUE
Created on 2023-01-25 with reprex v2.0.2
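To see why rvest alone comes back empty, here is a minimal sketch using a made-up HTML string rather than the real page. The markup and the inline script are assumptions for illustration only; the actual generator page will look different. The point is just that read_html() parses the markup and never runs the script, so the element a script would fill in stays empty:

```r
library(rvest)

# Hypothetical stand-in for the page: an empty <p id="title"> that a script
# would normally fill in. This is NOT the real markup of the generator page.
html <- '<html><body>
  <p id="title"></p>
  <script>document.getElementById("title").innerText = "Torade";</script>
</body></html>'

page <- read_html(html)

# read_html() only parses the document; the <script> is never executed,
# so the text of the element is still an empty string.
static_text <- html_text(html_element(page, xpath = "//p[@id='title']"))
nchar(static_text)
```

That is the reason a headless browser such as chromote is needed here: something has to actually execute the page's JavaScript before the name exists in the DOM.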