I'm trying to wait for the webpage to fully load before I proceed and find some elements.
1. If I do
EC.presence_of_element_located((By.XPATH, "//*[contains(text(), 'my text 1234567')]"))
I will get
<selenium.webdriver.support.expected_conditions.presence_of_element_located at 0x143304641c0>
Does this mean my text was found?
2. But if I do
WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.XPATH, "//*[contains(text(), 'my text 1234567')]")))
I will get
selenium.common.exceptions.TimeoutException: Message:
3. Then I checked
driver.find_elements_by_xpath("//*[contains(text(), 'my text 1234567')]")
Out[55]: []
4. If I do
driver.page_source.find('my text 1234567')
Out[64]: 971
I'm very confused. Why does this happen? Should I modify my By.XPATH?
CodePudding user response:
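Number 1: on its own, this line only builds a condition object; it doesn't search the page or return any text, so the output doesn't mean your text was found. Once the condition is actually run, the Selenium documentation describes it as checking "that an element is present on the DOM of a page. This does not necessarily mean that the element is visible."
Number 2: this raises a TimeoutException because the wait runs the condition repeatedly for up to 5 seconds and no element matching your XPath ever turns up.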
Number 3: you're using find_elements_by_xpath (plural), so it looks for all matching elements; since it doesn't find any, it returns an empty list.
Number 4: you're using the .find method on a string, which returns the character position of your text inside the page's HTML source.
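To make Numbers 1 and 2 more concrete, here is a minimal pure-Python sketch of the polling idea behind WebDriverWait.until. It is not Selenium's implementation: until_sketch, make_condition and fake_driver are hypothetical names, and a plain list of strings stands in for the page. It only illustrates that building a condition checks nothing by itself, and that the wait keeps calling the condition until it returns something truthy or the timeout runs out.

```python
import time


def until_sketch(condition, driver, timeout=1.0, poll=0.1):
    """Call condition(driver) repeatedly until it returns a truthy value.

    Hypothetical stand-in for WebDriverWait(driver, timeout).until(condition);
    it raises the built-in TimeoutError, not Selenium's TimeoutException.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition never became truthy")
        time.sleep(poll)


def make_condition(needle):
    """Build a condition; nothing is checked until the returned function runs."""
    def condition(driver):
        # "driver" is just a list of strings standing in for page elements.
        matches = [el for el in driver if needle in el]
        return matches[0] if matches else False
    return condition


fake_driver = ["header", "my text 1234567"]

# Building the condition only gives back a callable (compare Number 1).
cond = make_condition("my text")
print(callable(cond))

# Handing it to the wait is what actually runs the check.
print(until_sketch(cond, fake_driver))

# A condition that never matches ends in a timeout (compare Number 2).
try:
    until_sketch(make_condition("missing"), fake_driver, timeout=0.3)
except TimeoutError as exc:
    print("timed out:", exc)
```

In the real library the wait passes the driver to the condition in the same way, which is why the bare condition object in the question says nothing about whether the text exists.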
For further help, could you share the URL that you're using and the element that you want to extract?
CodePudding user response:
I'm not sure what result you want. If you simply want to return the text "Securities Exchange Act of 1934", this is an option:
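from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Raw string so the backslash in the Windows path is not read as an escape.
# Selenium 3 style; newer Selenium 4 releases expect a Service object instead of a bare path.
path = r"YOUR PATH HERE\chromedriver.exe"
driver = webdriver.Chrome(path)
wait = WebDriverWait(driver, 5)
driver.get("https://www.sec.gov/Archives/edgar/data/896397/000089639701500011/seh10q2qtr2001.htm")
# Wait until at least one <p> element is present in the DOM.
wait.until(EC.presence_of_element_located((By.TAG_NAME, "p")))
# Take the second <p>, then its fourth line, and drop the first 60 characters.
paragraph = driver.find_elements(By.TAG_NAME, "p")[1].text
line = paragraph.split("\n")[3]
result = line[60:]
print(result)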
If you instead want to return the first paragraph that contains "Securities Exchange Act of 1934", this is an option, using syntax similar to yours:
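from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Raw string so the backslash in the Windows path is not read as an escape.
# Selenium 3 style; newer Selenium 4 releases expect a Service object instead of a bare path.
path = r"YOUR PATH HERE\chromedriver.exe"
driver = webdriver.Chrome(path)
wait = WebDriverWait(driver, 5)
driver.get("https://www.sec.gov/Archives/edgar/data/896397/000089639701500011/seh10q2qtr2001.htm")
# contains(., ...) tests the whole text of each <p>, not just its first text node.
# until() returns the matching element once it is present in the DOM.
paragraph = wait.until(EC.presence_of_element_located((By.XPATH, "//p[contains(.,'Securities Exchange Act of 1934')]")))
print(paragraph.text)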
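One possible reason the contains(., ...) form can match where contains(text(), ...) did not: in XPath 1.0, contains(text(), ...) only looks at the first text node of an element, while "." covers all of the element's text. I can't confirm that this is what happens on your page, so treat the following as a rough standard-library emulation rather than the browser's XPath engine. The markup and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the target text follows a <br/>, so it is the SECOND
# text node of <p>, not the first.
html = "<div><p>Intro line<br/>my text 1234567</p></div>"
needle = "my text 1234567"

# A raw substring search over the source succeeds (like page_source.find).
print(html.find(needle) >= 0)

p = ET.fromstring(html).find("p")

# Rough analogue of contains(text(), needle): only the first text node counts.
# In this snippet p.text ("Intro line") is that first text node.
first_text_node = p.text or ""
print(needle in first_text_node)

# Rough analogue of contains(., needle): all descendant text, concatenated.
full_text = "".join(p.itertext())
print(needle in full_text)
```

If your element is laid out like this, a substring search of page_source finds the text while the text()-based XPath finds no element, which is the combination you saw.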