The problem I'm trying to solve is like this: I'm trying to scrape some content from a web page, I'm using selenium, findElementByClassName
to get the element content, and it works great until now. But considering that the website that I'm scraping changes one of those element classes in html, I don't want to get an could not find element exception
making the rest of the code not execute and jumping straight into the catch block.
My idea was to put each line of code into a try catch block, but having about 15 fields that I want to scrape it makes the code look ugly. See for yourself:
String name = null;
String type = null;
String description = null;
try {
driver.get(link);
try {
name = driver.findElementByClassName(environment.getProperty("booking.propertyName")).getText();
}catch (Exception e){
log.error("error doing thing");
}
try {
type = driver.findElementByClassName(environment.getProperty("booking.propertyType")).getText();
}catch (Exception e){
log.error("error doing thing");
}
try {
description = driver.findElementByClassName(environment.getProperty("booking.propertyDescription")).getText();
}catch (Exception e){
log.error("error doing thing");
}
}catch (Exception e){
log.error("Error during scraping");
}
So if one of these things goes wrong, I still want the rest of the code to continue instead of when having one try-catch block where the first thing failing would stop the other things from executing. The code above works just fine but it does not look good so my question do you have any ideas of how I could make this better looking.
CodePudding user response:
There is no magic bullet for this. But the standard way avoid repetitive code is to refactor. For example:
try {
type = driver.findElementByClassName(environment.getProperty("something"))
.getText();
} catch (Exception e){
log.error("error doing thing");
}
can be rewritten as:
type = getElementTextIgnoringExceptions(driver, environment, "something");
where getElementTextIgnoringExceptions
has been defined as something like this:
public String getElementTextIgnoringExceptions(
Driver driver, Environment env, String name) {
try {
String className = env.getProperty(name);
return driver.findElementByClassName(className).getText();
} catch (Exception ex) {
log.error("error getting " name, ex);
return null;
}
}
However ... there are some bad things about the code that you are trying to simplify here:
- Catching
Exception
is bad. You have no idea what you will catch, or whether it is safe or sensible to continue. - Not logging the exception is bad. How are you going to diagnose the problem if all you have an "error doing thing" message in your log file?
- Continuing after the exceptions is (in the context of your application) liable to cause problems. The rest of your code will be littered with
null
checks to deal with the elements (or whatever) that couldn't be fetched. Miss one check and you are liable to get an NPE; e.g. in some edge-case that you didn't cover in your unit tests.
These issues are more significant than making the code look good.
If you are using Java 8 , it may be possible to refactor so that the logic is passed as lambda expressions. It depends on the nature of the variables used.