How can you continue the for loop in R even after an error?

Time:03-08

I am parsing data from multiple links, but some of those links break after a while, and when I parse them with the rvest package I get an error or a warning. What can I do so that the for loop continues and moves on to the next link?


house_link <- "https://lalafo.kg/bishkek/ads/104-seria-2-komnaty-47-kv-m-s-mebelu-kondicioner-zivotnye-ne-prozivali-id-95221626"
house_features = data.frame()

for(x in 1:length(house_link)) {
  
   tryCatch({
      page_data = read_html(house_link[x])
      message("Executed.")
  }, error = function(e){
      message('Caught an error!')
      print(e)
  }, warning = function(w){
      message('Caught a warning!')
      print(w)
  }, finally = {
      message('All done, quitting.')
  }
)    
    pricing = page_data %>% html_nodes(".css-13sm4s4") %>% 
      html_element("span") %>% html_text() 
    house_features = rbind(house_features, data.frame(pricing, stringsAsFactors = FALSE))
}

CodePudding user response:

Maybe something like this?

library(rvest)

house_link <- "https://lalafo.kg/bishkek/ads/104-seria-2-komnaty-47-kv-m-s-mebelu-kondicioner-zivotnye-ne-prozivali-id-95221626"
house_features = data.frame()

for(x in 1:3) { # 1:3 is only for testing; use seq_along(house_link) to loop over all of your links
  
  cat('Link', x)
  
  start_time <- Sys.time()
  if (x %% 200 == 0) {  # pause for 5 seconds every 200 links
    print("pausing ...")
    Sys.sleep(5)
  }
  
  page_data <- tryCatch({
    page <- read_html(house_link[x])
    message("Executed.")
    page  # the last expression is the value tryCatch returns on success
  }, error = function(e){
    message('\nCaught an error!')
    return(NA) # the handler's return value becomes the value of tryCatch, so page_data is NA when the read fails
  }, finally = {cat('Continuing with', x + 1, '\n')})   # finally runs whether or not an error occurred
  
  ## Skip to the next link when the read failed
  ############################
  if(identical(page_data, NA)){  #
    cat('this is a test\n')  #
    next                     #
    }                        #
  ############################
  
  else{  # else is not strictly necessary, but it makes the two branches explicit
    pricing = page_data %>% html_nodes(".css-13sm4s4") %>% 
      html_element("span") %>% html_text() 
    house_features = rbind(house_features, data.frame(pricing, stringsAsFactors = FALSE))
  }
}

Link 1
Caught an error!
Continuing with 2 
this is a test

Link 2
Caught an error!
Continuing with 3 
this is a test

Link 3
Caught an error!
Continuing with 4 
this is a test
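
The same pattern can also be packed into a small helper, which may be easier to reuse. The following is only a sketch: fake_read() is a hypothetical stand-in for read_html() so that the example runs without network access, and safe_read() and the example.com links are made-up names for illustration. The idea is that the helper returns NULL when the read fails, and the loop skips that link with next.

```r
# Hypothetical stand-in for read_html(): it fails for NA or "broken" links
# so the example can run offline.
fake_read <- function(link) {
  if (is.na(link) || grepl("broken", link)) stop("could not open ", link)
  paste("page for", link)
}

# Wrap the risky call: return its result on success, NULL on error.
safe_read <- function(link) {
  tryCatch(
    fake_read(link),
    error = function(e) {
      message("Skipping ", link, ": ", conditionMessage(e))
      NULL  # value of tryCatch() when an error was caught
    }
  )
}

links <- c("https://example.com/a",
           "https://example.com/broken",
           "https://example.com/c")
results <- list()

for (i in seq_along(links)) {
  page <- safe_read(links[i])
  if (is.null(page)) next  # failed read: move on to the next link
  results[[length(results) + 1]] <- page
}

length(results)  # number of links that were read successfully
```

With real data you would call read_html() inside safe_read() instead of fake_read(), and do the html_nodes() extraction where the result is appended.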