If-statement in RSelenium

Time:05-06

I have a long list of chemicals for which I need to extract the CAS number. I have written a for loop which works as intended. However, when a chemical name is not found on the website, my code stops with an error.

Is there a way to account for this in the for loop, so that when a search query is not found, the loop goes back to the start page and searches for the next item in the list?

Below is my for loop, with a short list of names to search for:

library(RSelenium)
library(netstat)

# start the server

rs_driver_object <- rsDriver(browser = "firefox",
                             verbose = FALSE,
                             port = 4847L) # change number if port is not open

# create a client object
remDrCh <- rs_driver_object$client

items <- c("MCPA", "DEET", "apple")
numbers <- list()
for (i in items) {
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  search_box <- remDrCh$findElement(using = 'class', 'search-input')
  search_box$sendKeysToElement(list(paste(i), key = 'enter'))
  Sys.sleep(2)
  result <- remDrCh$findElement(using = "class", "result-content")
  result$clickElement()
  Sys.sleep(2)
  cas <- remDrCh$findElements(using = 'class', 'cas-registry-number')
  cas_n <- lapply(cas, function (x) x$getElementText()) 
  numbers[[i]] <- unlist(cas_n)
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  Sys.sleep(2)
}

The problem lies in the result <- remDrCh$findElement(using = "class", "result-content") part. For "apple" there is no result, and thus no element that R could use.

I tried to write a separate if/else statement for that specific part, but to no avail: it still only works for queries that yield a result. I also tried findElements, but that only helps in the case where no result is found.

result <- remDrCh$findElement(using = "class", "result-content")
if (length(result) > 0) {
  result$clickElement()
} else {
  remDrCh$navigate("https://commonchemistry.cas.org/")
}

I also tried the approach from "How to check if an object is visible in a webpage by using its xpath?", but I cannot get it to work on my example.

Any help would be much appreciated!

CodePudding user response:

This should work:

items <- c("MCPA", "apple", "DEET")
numbers <- list()
for (i in items) {
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  search_box <- remDrCh$findElement(using = 'class', 'search-input')
  search_box$sendKeysToElement(list(paste(i), key = 'enter'))
  Sys.sleep(2)
  result <- try(remDrCh$findElement(using = "class", "result-content"))
  if (!inherits(result, "try-error")) {
    result$clickElement()
    Sys.sleep(2)
    cas <- remDrCh$findElements(using = 'class', 'cas-registry-number')
    cas_n <- lapply(cas, function (x) x$getElementText())
    numbers[[i]] <- unlist(cas_n)
  } else {
    numbers[[i]] <- NA
  }
  Sys.sleep(2)
  remDrCh$navigate("https://commonchemistry.cas.org/")
  Sys.sleep(2)
}

Note the try() wrapper around the problematic code:

  result <- try(remDrCh$findElement(using = "class", "result-content"))

This captures the error if there is one, but allows the loop to continue. The if statement then checks whether the output of try() is of class "try-error". If it is not, the code clicks the result and extracts the CAS number; otherwise, it stores NA for that item.
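The same pattern can be tried without a browser. This is only a minimal sketch: find_element_stub() is a hypothetical stand-in (not part of RSelenium) that throws an error for unknown names, roughly the way findElement() does when nothing matches, and the strings it returns are placeholders rather than real CAS numbers. The optional silent = TRUE argument of try() just suppresses the printed error message.

```r
# Hypothetical stand-in for remDrCh$findElement(): it raises an error
# when the name is unknown, instead of returning an element.
find_element_stub <- function(name) {
  known <- c("MCPA", "DEET")
  if (!name %in% known) {
    stop("no such element: ", name)
  }
  paste0("element-for-", name)  # placeholder value, not a real CAS number
}

items <- c("MCPA", "apple", "DEET")
numbers <- list()

for (i in items) {
  # try() returns an object of class "try-error" instead of stopping the loop
  result <- try(find_element_stub(i), silent = TRUE)
  if (inherits(result, "try-error")) {
    numbers[[i]] <- NA          # mark the item as not found
  } else {
    numbers[[i]] <- result      # keep whatever was found
  }
}

str(numbers)
```

Storing NA rather than skipping the item keeps one entry per search term, so the result list still lines up with items afterwards.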
