I hope everybody is doing well!
I have a task that I do not quite know how to solve, so let me start from the beginning:
- I want to scrape details about streets and their associated numbers from a web page (a map). For this job, I chose Selenium. I watched some tutorials and read various answers, but without success.
The first part was to retrieve the streets, which I managed to do. The page looks like this:
So there is a table with a bunch of <tr> tags from which I managed to retrieve the needed data. However, things got a bit more complicated once I needed to find a way to retrieve the numbers.
The street numbers are located in a <select> tag which APPEARS ONLY when I click on the given <tr> tag. So, from the previous image, the structure becomes like this:
But again, only IF I click on the <tr> tag. If I click another <tr> tag, the <select> tag from the previous <tr> disappears, and a new one pops up in the selected row with the corresponding <option> tags.
Now, my question is: how can I iterate through the entire table, click on each <tr>, and, IF a <select> tag pops up, retrieve the numbers associated with each street?
I would like to store the result in a dictionary where the key is the name of the street and the value is the list of associated numbers.
So far, this is the code I wrote, which of course fails to solve my problem.
UPDATE 3:
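import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# path_ex is assumed to hold the path to the chromedriver executable.
driver = webdriver.Chrome(service=Service(path_ex))
driver.get("https://cluj-city.map2web.eu/")
wait = WebDriverWait(driver, 30)
# panel = driver.find_element(By.CSS_SELECTOR, "h4#titleStreets")
panel = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "h4#titleStreets")))
panel.click()
streets = driver.find_elements(By.XPATH, "//table[@id='streets']/tbody/tr/td")
# for i in range(len(streets)):
for i in range(10):  # Tried for 10 elements.
    # Re-locate the rows on every pass, because clicking a row changes the table.
    streets = driver.find_elements(By.XPATH, "//table[@id='streets']/tbody/tr")
    print(streets[i].find_element(By.XPATH, "./td").text)
    streets[i].find_element(By.XPATH, "./td").click()
    time.sleep(2)
    try:
        # find_elements returns an empty list (it does not raise) when no <option> exists.
        numbers = streets[i].find_elements(By.XPATH, "./td/select/option")
        # A separate loop variable avoids shadowing the outer index i.
        num_list = [option.get_attribute("value") for option in numbers]
        print(num_list)
    except Exception:
        print("No numbers")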
Error I get:
Traceback (most recent call last):
  File "/home/tudor/PycharmProjects/pythonProject/beat.py", line 18, in <module>
    panel = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,"h4#titleStreets")))
  File "/home/tudor/PycharmProjects/pythonProject/venv/lib/python3.9/site-packages/selenium/webdriver/support/wait.py", line 89, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Answer:
Try the code below and confirm whether it works for you.
# Assumes `driver` already exists and that By, WebDriverWait, EC and time are imported.
driver.get("https://cluj-city.map2web.eu/")
wait = WebDriverWait(driver, 30)
# panel = driver.find_element(By.CSS_SELECTOR, "h4#titleStreets")
panel = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "h4#titleStreets")))
panel.click()
streets = driver.find_elements(By.XPATH, "//table[@id='streets']/tbody/tr/td")
# for i in range(len(streets)):
for i in range(10):  # Tried for 10 elements.
    # Re-locate the rows on every pass, because clicking a row changes the table.
    streets = driver.find_elements(By.XPATH, "//table[@id='streets']/tbody/tr")
    print(streets[i].find_element(By.XPATH, "./td").text)
    streets[i].find_element(By.XPATH, "./td").click()
    time.sleep(2)
    try:
        # find_elements returns an empty list (it does not raise) when no <option> exists.
        numbers = streets[i].find_elements(By.XPATH, "./td/select/option")
        # A separate loop variable avoids shadowing the outer index i.
        num_list = [option.get_attribute("value") for option in numbers]
        print(num_list)
    except Exception:
        print("No numbers")
Output
Aleea Albăstrelelor
[]
Aleea Alexandru Lapedatu
['0', '672723', '673946', '672724', '686707', '685369', '674092', '685370', '674093', '685371', '685379', '686708', '674094', '686710', '686711']
Aleea Anemonelor
['0', '674371']
Aleea Azaleelor
['0', '686876', '686847', '667291', '669662', '686877', '686878']
Aleea Azuga
['0', '675419', '661994', '676263', '662870', '666294', '665524', '675422', '662090']
Aleea Azur
[]
Aleea Băişoara
['0', '675416', '685313', '662861', '673532', '665446', '675426', '678804', '663696', '661674', '661988', '672374', '675431']
Aleea Băiţa
['0', '666522', '678447', '661990', '662881', '661992', '661993', '665375', '662880', '661987', '661995', '676267']
Aleea Bâlea
['0', '676268', '676266', '662865', '662866', '663658', '665715', '686000', '686014', '686012', '686013', '686015', '673391', '686019', '686020', '686007', '686801', '686010', '662878']
Aleea Bârsei
['0', '663687', '675417', '662871', '676265', '663692']
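The question asks for the result as a dictionary keyed by street name, while the loop above only prints each list. A minimal sketch of that last step might look like the following. The names `collect_street_numbers`, `limit`, `pause`, `XPATH` and `ROWS` are hypothetical and not part of the original code. It also assumes that the leading '0' in every non-empty list is a placeholder entry rather than a real number, which the output suggests but does not prove. The collected `value` attributes may be internal IDs rather than the visible house numbers; if so, reading each option's text instead could be worth trying.

```python
import time

# Same string as selenium's By.XPATH; used directly so this sketch
# does not need a selenium import of its own.
XPATH = "xpath"

# Locator for the street rows, taken from the code above.
ROWS = "//table[@id='streets']/tbody/tr"


def collect_street_numbers(driver, limit=10, pause=2.0):
    """Return {street name: [option values]} for the first `limit` rows."""
    result = {}
    total = len(driver.find_elements(XPATH, ROWS))
    for index in range(min(limit, total)):
        # Re-locate the rows each time: clicking a row changes the table.
        row = driver.find_elements(XPATH, ROWS)[index]
        cell = row.find_element(XPATH, "./td")
        name = cell.text  # read the name before the <select> is injected
        cell.click()
        time.sleep(pause)  # crude wait for the <select> to appear
        options = row.find_elements(XPATH, "./td/select/option")
        values = [option.get_attribute("value") for option in options]
        # Assumption: "0" is a placeholder option, not a real number.
        result[name] = [value for value in values if value != "0"]
    return result
```

With the `driver` from the code above, after the streets panel has been opened, this would be called as `street_numbers = collect_street_numbers(driver, limit=10)`. Streets without a <select> end up with an empty list.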