I was wondering how I can pull text from a website using Selenium and Python 3. I don't know what the text is, so I can't just search for the sentence and copy it. Here is an example screenshot: Example Problem. Now, in this scenario I am looking for the small amount of text right after the "1.", but it is represented by just ::header, so I am having trouble grabbing it. Any ideas? Thanks! Also, the website I am pulling from is Quia.
CodePudding user response:
It's hard to answer directly because the page in this example is behind a login. Broadly speaking, you can use XPath expressions, which require knowing the page's XML/HTML tree. You can inspect that tree in the browser's developer tools: press F12 in Chrome or Firefox, or choose "Inspect" from the right-click context menu. Here is an example that gets the welcome text from the login page of the same server:
from selenium import webdriver
from selenium.webdriver.common.by import By


def s_obj(sel_drv, xph):
    # Return every element matching the XPath (an empty list if none match).
    return sel_drv.find_elements(by=By.XPATH, value=xph)


def s_text(sel_drv, xph):
    # Join the visible text of all matches, turning line breaks into "; ".
    els = s_obj(sel_drv, xph)
    if not els:
        return ''
    joined = '; '.join(el.text.replace('\n', '; ') for el in els)
    return joined.strip(';').strip()


test_url = "https://www.quia.com/web"
sel_drv = webdriver.Chrome()
sel_drv.get(test_url)
# XPath of the welcome heading on the login page.
bs_xph = "//*/table/tbody/tr/td[@colspan=\"5\"]/h1[@class=\"home\"]"
expected_txt = s_text(sel_drv, f"{bs_xph}[1]")
print(expected_txt)
sel_drv.quit()
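If you want to try out a path expression before pointing Selenium at the real page, a rough offline sketch like the one below may help. It uses only the standard library. The markup in SAMPLE_HTML and the helper name texts_by_path are made up for illustration; the actual Quia page will have a different structure. Note also that xml.etree.ElementTree supports only a subset of XPath and needs well-formed markup, whereas Selenium evaluates full XPath in the browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page structure will differ.
SAMPLE_HTML = """
<html><body>
  <table><tbody><tr>
    <td colspan="5"><h1 class="home">Welcome text</h1></td>
  </tr></tbody></table>
  <ol>
    <li><span class="q">First question text</span></li>
    <li><span class="q">Second question text</span></li>
  </ol>
</body></html>
"""


def texts_by_path(markup, path):
    """Return the stripped text of every element matching an ElementTree path."""
    root = ET.fromstring(markup.strip())
    # itertext() also collects text from nested child elements.
    return ["".join(el.itertext()).strip() for el in root.findall(path)]


# Select by structure and attributes, without knowing the text in advance.
heading = texts_by_path(SAMPLE_HTML, ".//td[@colspan='5']/h1[@class='home']")
# Select the first list item by position.
first_item = texts_by_path(SAMPLE_HTML, ".//ol/li[1]/span")
print(heading)
print(first_item)
```

The point is the same as in the Selenium example: the element is found by where it sits in the tree (tag names, attributes, position), so the text itself never has to be known beforehand.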