I am trying to scrape information from a company's career website. I want to get the reference code of each job ad.
I want to use Selenium and tried to locate the job posting code with XPath. When I run the code, a Google Chrome window opens and loads the correct web address:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import pandas as pd
PATH = "C:/Users/MyUser/Desktop/Driver/chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.uke.jobs/sap(bD1kZSZjPTUwMA==)/bc/bsp/kwp/bsp_eui_rd_uc/main.do?action=to_uc_search")
driver.maximize_window()
ref_code = driver.find_elements_by_xpath("//tr[@data-eui-handler=\"{event:'click',handler:'eui.app.controller.search_results.selectRow'}\"]/td[1]")
print(len(ref_code))
User_input = input()
When I run the code, it takes forever and I get the following output:
DevTools listening on ws://127.0.0.1:52187/devtools/browser/7300c3d2-42d1-4f8e-a136-4e1ce37bcb87
c:\Users\MyUser\Desktop\PyhtonVisStuCo\Selenium.py:15: DeprecationWarning: find_elements_by_xpath is deprecated. Please use find_elements(by=By.XPATH, value=xpath) instead
ref_code = driver.find_elements_by_xpath("//tr[@data-eui-handler=\"{event:'click',handler:'eui.app.controller.search_results.selectRow'}\"]/td[1]")
0
[3516:18308:0609/194039.395:ERROR:device_event_log_impl.cc(214)] [19:40:39.395] Bluetooth: bluetooth_adapter_winrt.cc:1074 Getting Default Adapter failed.
What am I doing wrong?
Thanks
Pat
CodePudding user response:
To extract the texts from the Referenzcode column, you can use a list comprehension with either of the following locator strategies:
Using CSS_SELECTOR:
driver.get("https://www.uke.jobs/sap(bD1kZSZjPTUwMA==)/bc/bsp/kwp/bsp_eui_rd_uc/main.do?action=to_uc_search")
print([my_elem.text for my_elem in driver.find_elements(By.CSS_SELECTOR, "table#table_search_results tr[data-head] td:first-of-type")])
Using XPATH:
driver.get("https://www.uke.jobs/sap(bD1kZSZjPTUwMA==)/bc/bsp/kwp/bsp_eui_rd_uc/main.do?action=to_uc_search")
print([my_elem.text for my_elem in driver.find_elements(By.XPATH, "//table[@id='table_search_results']//tr[@data-head]/td[1]")])
Console Output:
['ZVW22192', 'ZPF2208_ex', 'ZPF2207_e', 'ZPF2206_e', 'ZMF2249', 'ZIT22484', 'ZIT22444', 'ZIT22380', 'ZIT22379', 'WS22536']
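If the list comes back empty, one possible cause (an assumption, not something confirmed for this page) is that the result rows are rendered after the initial page load. A minimal sketch of an explicit wait around the CSS locator above could look like this. The function names fetch_reference_codes and clean_codes and the 20-second timeout are made up for illustration:

```python
def clean_codes(elements):
    """Return the stripped, non-empty text of each element-like object."""
    texts = (element.text.strip() for element in elements if element.text)
    return [text for text in texts if text]


def fetch_reference_codes(driver, timeout=20):
    """Wait until the first-column cells exist, then return their texts.

    Assumes the rows may appear late; the timeout value is arbitrary.
    """
    # Selenium is imported here so clean_codes() stays usable without it.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (
        By.CSS_SELECTOR,
        "table#table_search_results tr[data-head] td:first-of-type",
    )
    # Polls until at least one matching cell is present in the DOM;
    # raises TimeoutException if none appears within `timeout` seconds.
    cells = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(locator)
    )
    return clean_codes(cells)
```

After driver.get(...), calling print(fetch_reference_codes(driver)) should print the same kind of list as above, provided the locator still matches the page.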