Keep only an element of a webpage while web-scraping

I am trying to extract a table from a webpage with Python. I managed to get all the contents of that table, but since I am very new to web scraping, I don't know how to keep only the elements that I am looking for.

I know that I should look for this class in the code: <a , which identifies the items in the table.

So how can I keep only the elements with that class and then extract their titles?

<a  title="r/Python" href="/r/Python/">r/Python</a>
<a  title="r/Java" href="/r/Java/">r/Java</a>

I failed miserably at writing code for that. I don't know how to extract only these elements, so any input will be highly appreciated.

CodePudding user response：

To extract the value of the title attributes, you can use a list comprehension with either of the following locator strategies:

Using CSS_SELECTOR:

print([my_elem.get_attribute("title") for my_elem in driver.find_elements(By.CSS_SELECTOR, "a._3BFvyrImF3et_ZF21Xd8SC[title]")])

Using XPATH:

print([my_elem.get_attribute("title") for my_elem in driver.find_elements(By.XPATH, "//a[@class='_3BFvyrImF3et_ZF21Xd8SC' and @title]")])

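Note: The snippets above assume an existing driver instance. You have to add the following imports (only By is used directly above; WebDriverWait and expected_conditions are needed if you add an explicit wait):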

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
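
If the page's HTML has already been downloaded, the same class-and-title filter can be tried without a browser. The following is only a minimal sketch using Python's standard-library html.parser. The class name is taken from the selectors above, while the sample HTML and the names TitleCollector and extract_titles are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Class name taken from the selectors above; adjust it if the page changes.
TARGET_CLASS = "_3BFvyrImF3et_ZF21Xd8SC"


class TitleCollector(HTMLParser):
    """Collects the title attribute of every <a> tag carrying a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attr_map.get("class") or "").split()
        title = attr_map.get("title")
        if self.target_class in classes and title:
            self.titles.append(title)


def extract_titles(html, target_class=TARGET_CLASS):
    """Return the titles of all matching anchors in the given HTML string."""
    collector = TitleCollector(target_class)
    collector.feed(html)
    collector.close()
    return collector.titles


# Hypothetical sample modelled on the anchors shown in the question.
sample_html = """
<a class="_3BFvyrImF3et_ZF21Xd8SC" title="r/Python" href="/r/Python/">r/Python</a>
<a class="_3BFvyrImF3et_ZF21Xd8SC" title="r/Java" href="/r/Java/">r/Java</a>
<a class="other" title="ignored" href="/x/">x</a>
"""

print(extract_titles(sample_html))  # ['r/Python', 'r/Java']
```

This only sees the HTML it is given, so content that the page renders with JavaScript still needs a browser-driven tool such as Selenium.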

CodePudding user response：

Okay, I found a very simple approach that worked.

Basically, I pasted the code into VS Code and then selected all the occurrences of that class. Then I just had to copy and paste them into another file. Not sure why the shortcut Ctrl+Shift+L did not work, but I managed to get what I needed.

Select all occurrences of selected word in VSCode
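
The manual select-and-copy step could also be scripted. Below is a rough sketch that searches pasted page source with a regular expression. It assumes each anchor has exactly this one class and lists it before the title attribute, which is brittle for HTML in general. The function name find_titles and the sample string are hypothetical.

```python
import re

# Class name taken from the first answer's selectors.
TARGET_CLASS = "_3BFvyrImF3et_ZF21Xd8SC"


def find_titles(page_source, target_class=TARGET_CLASS):
    """Return title values of <a> tags whose class attribute equals target_class.

    Assumes the class attribute comes first and the title attribute follows
    somewhere inside the same opening tag.
    """
    pattern = re.compile(
        r'<a\s+class="' + re.escape(target_class) + r'"[^>]*?\btitle="([^"]*)"'
    )
    return pattern.findall(page_source)


# Hypothetical pasted source, modelled on the anchors in the question.
pasted = (
    '<a class="_3BFvyrImF3et_ZF21Xd8SC" title="r/Python" href="/r/Python/">r/Python</a>\n'
    '<a class="_3BFvyrImF3et_ZF21Xd8SC" title="r/Java" href="/r/Java/">r/Java</a>\n'
)

print(find_titles(pasted))  # ['r/Python', 'r/Java']
```

For anything beyond a quick one-off, a real HTML parser or the Selenium locators from the first answer are the safer choice.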