I want to get the US pronunciation from Cambridge Dictionary. My code so far is:
from selenium import webdriver
from YDSData import tdata, dummydata

words = dummydata
link = 'https://dictionary.cambridge.org/dictionary/english'
driver = webdriver.Chrome()
for word in words:
    # Build the page URL for the current word
    driver.get(link + "/" + str(word))
    try:
        result = driver.find_elements_by_xpath('//*[@id="page-content"]/div[2]/div[1]/div[2]/div/div[3]/div/div/div[1]/div[2]/span[2]/span[3]/span')
        print(result)
    except:
        driver.close()
This code is supposed to give me the US pronunciation from Cambridge Dictionary, but it prints:
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="6f609e22-76ab-443d-8216-4ac90aefda20")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="c7d8a664-d162-4d2c-8c87-dd8e10211024")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="7f163600-d25f-4000-893f-44693736ed41")>]
Also, it is extremely slow. Is something wrong with the code?
EDIT:
I've rewritten the code based on the answer below to figure out the problem. Now the code looks like this:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from YDSData import tdata, dummydata
driver = webdriver.Chrome()
driver.get("https://dictionary.cambridge.org/dictionary/english/dictionary")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[text()='us']//following-sibling::span[@class='pron dpron']"))).get_attribute("innerHTML"))
It works, but it took 8 minutes to run and printed these errors:
[5836:11428:0223/004955.530:ERROR:ssl_client_socket_impl.cc(995)] handshake failed; returned -1, SSL error code 1, net_error -201
[11580:7744:0223/005141.150:ERROR:gpu_init.cc(454)] Passthrough is not supported, GL is disabled, ANGLE is
[14788:5864:0223/005241.189:ERROR:chrome_browser_main_extra_parts_metrics.cc(227)] START: ReportBluetoothAvailability(). If you don't see the END: message, this is crbug.com/1216328.
[14788:5864:0223/005241.197:ERROR:chrome_browser_main_extra_parts_metrics.cc(230)] END: ReportBluetoothAvailability()
[14788:5864:0223/005241.215:ERROR:chrome_browser_main_extra_parts_metrics.cc(235)] START: GetDefaultBrowser(). If you don't see the END: message, this is crbug.com/1216328.
[14788:5864:0223/005241.243:ERROR:chrome_browser_main_extra_parts_metrics.cc(239)] END: GetDefaultBrowser()
[14788:11300:0223/005241.291:ERROR:device_event_log_impl.cc(214)] [00:52:41.291] USB: usb_device_handle_win.cc:1049 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)
CodePudding user response:
These lines:
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="6f609e22-76ab-443d-8216-4ac90aefda20")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="c7d8a664-d162-4d2c-8c87-dd8e10211024")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="7f163600-d25f-4000-893f-44693736ed41")>]
are the WebElements themselves as they appear when you print them on the console. find_elements_by_xpath() returns a list of WebElement objects, so printing that list shows the objects rather than their text.
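As a minimal illustration of the difference between printing element objects and printing their text, the sketch below uses a stand-in class instead of a real browser. FakeElement and element_texts are hypothetical names introduced only for this example; with Selenium you would pass the list returned by your own lookup instead.

```python
def element_texts(elements):
    """Return the .text of each element rather than the element objects."""
    return [element.text for element in elements]


class FakeElement:
    """Stand-in for a Selenium WebElement, used only to show the idea offline."""

    def __init__(self, text):
        self.text = text


found = [FakeElement("/ˈdɪk.ʃən.er.i/")]
print(found)                 # object representations, similar to the output above
print(element_texts(found))  # the text carried by each element
```

Real WebElements expose the same .text attribute, so the helper should work unchanged on a list returned by Selenium.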
To print the text /ˈdɪk.ʃən.er.i/ you can use the following locator strategy:
Using xpath:
driver.get("https://dictionary.cambridge.org/dictionary/english/dictionary")
print(driver.find_element(By.XPATH, "//span[text()='us']//following-sibling::span[@class='pron dpron']").text)
To extract the text /ˈdɪk.ʃən.er.i/ reliably, you should ideally use WebDriverWait with visibility_of_element_located(), and you can use the following locator strategy:
Using XPATH:
driver.get("https://dictionary.cambridge.org/dictionary/english/dictionary")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[text()='us']//following-sibling::span[@class='pron dpron']"))).get_attribute("innerHTML"))
Console Output:
/ˈdɪk.ʃən.er.i/
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
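To apply the same wait-based lookup to a whole list of words, as in the question's loop, something like the following sketch might work. It is untested against the live site; word_url and fetch_us_pronunciations are hypothetical helper names, the URL pattern is assumed from the link used in the question, and the XPath is the one from this answer. The Selenium imports sit inside the function only so that the URL helper can be tried without a browser installed.

```python
from urllib.parse import quote

BASE_URL = "https://dictionary.cambridge.org/dictionary/english"
US_PRON_XPATH = "//span[text()='us']//following-sibling::span[@class='pron dpron']"


def word_url(word):
    """Build the dictionary page URL for one word (assumed URL pattern)."""
    return f"{BASE_URL}/{quote(str(word))}"


def fetch_us_pronunciations(words, timeout=20):
    """Return a dict mapping each word to its US pronunciation text, or None."""
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    results = {}
    try:
        for word in words:
            driver.get(word_url(word))
            try:
                element = WebDriverWait(driver, timeout).until(
                    EC.visibility_of_element_located((By.XPATH, US_PRON_XPATH))
                )
                results[word] = element.text
            except TimeoutException:
                # No matching element became visible within the timeout
                results[word] = None
    finally:
        # Close the browser once, after all words, even if a page fails
        driver.quit()
    return results
```

Catching only TimeoutException and quitting the driver in a finally block keeps one failed word from closing the browser mid-loop, which the bare except with driver.close() in the question would do.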