I want to build a crawler to gather some information from a website, using Python and Selenium. The problem is that elements cannot be found by class name but can be found using XPath. The code I'm using is below:
HTML:
<h1 class="txt-h4 clr-900 lf-2">
Xiaomi Redmi Note 11 Dual SIM 128GB And 6GB RAM Mobile Phone
</h1>
Selenium XPATH: (Working Solution)
product_name = driver.find_element(By.XPATH, "/html/body/div[1]/div[1]/div[3]/div[3]/div[2]/div[2]/div[2]/div[1]/div/h1").text.strip()
Selenium CLASS NAME: (Not working solution)
product_name = driver.find_element(By.CLASS_NAME, "txt-h4 clr-900 lf-2").text.strip()
I also tried this approach using beautifulsoup4, but the result was the same with class names:
product_name = page_soup.find("h1", {"class":['txt-h4 clr-900 lf-2']}).text.strip()
The error that I get with this solution is:
AttributeError: 'NoneType' object has no attribute 'text'
What I need to do is to be able to locate elements by class name because of granularity.
CodePudding user response:
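The element is not found because you are passing several class names at once. By.CLASS_NAME accepts a single class name.
Instead of this line:
product_name = driver.find_element(By.CLASS_NAME, "txt-h4 clr-900 lf-2").text.strip()
use this line:
product_name = driver.find_element(By.CLASS_NAME, "txt-h4").text.strip()
That is, pass just "txt-h4", not "txt-h4 clr-900 lf-2". Note that this returns the first element with the class txt-h4, which may not be the one you want if other elements share that class.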
CodePudding user response:
You cannot match this element by passing all three names to By.CLASS_NAME: the element has multiple classes, and this method expects a single class name, which it compares against each of the element's classes.
To match on all three classes at once, search using:
XPath
driver.find_element(By.XPATH, "//*[@class='txt-h4 clr-900 lf-2']")
or a CSS selector
driver.find_element(By.CSS_SELECTOR, ".txt-h4.clr-900.lf-2")
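The CSS selector above can also be built from the class attribute value instead of being typed by hand. The following is a minimal sketch, not part of either answer: the helper names css_for_classes and find_by_classes are hypothetical, and it assumes Selenium 4 and class names that contain no characters needing CSS escaping.

```python
def css_for_classes(class_attr: str) -> str:
    """Turn a space-separated class attribute value into a compound CSS selector.

    Hypothetical helper: "txt-h4 clr-900 lf-2" becomes ".txt-h4.clr-900.lf-2".
    Assumes the class names need no CSS escaping.
    """
    names = class_attr.split()
    if not names:
        raise ValueError("expected at least one class name")
    return "".join(f".{name}" for name in names)


def find_by_classes(driver, class_attr: str):
    """Return the first element that carries every class listed in class_attr.

    Assumes `driver` is a Selenium 4 WebDriver. The import is kept local so
    that css_for_classes stays usable without Selenium installed.
    """
    from selenium.webdriver.common.by import By

    return driver.find_element(By.CSS_SELECTOR, css_for_classes(class_attr))
```

With a running driver, this would be used as product_name = find_by_classes(driver, "txt-h4 clr-900 lf-2").text.strip(). Unlike the XPath with @class='...', the compound selector does not depend on the order of the classes or on the attribute matching the string exactly. The same selector string should also work with BeautifulSoup through page_soup.select_one(css_for_classes("txt-h4 clr-900 lf-2")), which returns None when nothing matches, so check the result before reading .text.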