I have the following HTML on a website I'm trying to scrape:
<div class="test-section-container">
<div>
<span class="test-section-title">Section Title</span>
<div style="display: inline-block; padding: 0.05rem;"></div>
</div>
<div style="cursor: pointer; background-color: rgb(248, 248, 248); display: flex; line-height: 1.2; margin-bottom: 0.07rem;">
<div style="width: 0.5rem; flex-shrink: 0; background-color: rgb(245, 222, 136);"></div>
<div style="padding: 0.07rem; overflow: hidden;">
<div style="font-size: 0.18rem; text-overflow: ellipsis; overflow: hidden; white-space: nowrap;">Newsletter 1</div>
<div style="font-size: 0.13rem; color: rgb(102, 102, 102);">2021 11 8</div>
</div>
</div>
<div style="cursor: pointer; background-color: rgb(248, 248, 248); display: flex; line-height: 1.2; margin-bottom: 0.07rem;">
<div style="width: 0.5rem; flex-shrink: 0; background-color: rgb(221, 221, 221);"></div>
<div style="padding: 0.07rem; overflow: hidden;">
<div style="font-size: 0.18rem; text-overflow: ellipsis; overflow: hidden; white-space: nowrap;">Newsletter 2 </div>
<div style="font-size: 0.13rem; color: rgb(102, 102, 102);">2021 11 3</div>
</div>
</div>
This is the Selenium/Python code that I'm using:
driver.get("http://www.testwesbite.org/#/newsarticles")
results = driver.find_elements_by_class_name('test-section-container')
texts = []
for result in results:
    text = result.text
    texts.append(text)
    print(text)
This gives me the following output:
Newsletter 1
2021 11 8
Newsletter 2
2021 11 3
If I use the following code:
first_result = results[0]
first_result.click()
It does click into the first article, but results[1] gives me an out-of-bounds error.
How would I go about clicking on the second article?
CodePudding user response:
Because you used driver.find_elements_by_class_name('test-section-container'), only the single container element is matched, so all of the following texts:
- Newsletter 1
- 2021 11 8
- Newsletter 2
- 2021 11 3
are within the results[0] element, and results[1] doesn't exist. Hence you get an out-of-bounds error (an IndexError in Python).
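The point above can be checked offline, without a browser. The following is only a rough sketch: it parses a trimmed, hand-closed copy of the markup from the question with Python's standard library (not Selenium) and counts how many elements carry the container class versus how many title elements there are. The `MARKUP` string and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's markup, closed by hand so it parses as XML.
MARKUP = """
<div class="test-section-container">
  <div><span class="test-section-title">Section Title</span></div>
  <div style="cursor: pointer;">
    <div style="padding: 0.07rem;">
      <div style="white-space: nowrap;">Newsletter 1</div>
      <div style="color: rgb(102, 102, 102);">2021 11 8</div>
    </div>
  </div>
  <div style="cursor: pointer;">
    <div style="padding: 0.07rem;">
      <div style="white-space: nowrap;">Newsletter 2 </div>
      <div style="color: rgb(102, 102, 102);">2021 11 3</div>
    </div>
  </div>
</div>
""".strip()

root = ET.fromstring(MARKUP)

# Elements carrying the container class: the root itself is the only one.
containers = [el for el in root.iter("div")
              if el.get("class") == "test-section-container"]

# Elements whose inline style mentions "nowrap": one per article title.
titles = [el for el in root.iter("div") if "nowrap" in el.get("style", "")]

print(len(containers))                   # 1
print([t.text.strip() for t in titles])  # ['Newsletter 1', 'Newsletter 2']
```

With one container there is only an index 0, which is why a selector that targets the per-article elements is needed.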
Solution
To make each article available as results[0] and results[1], locate the individual title elements instead of the container:
driver.get("http://www.testwesbite.org/#/newsarticles")
results = driver.find_elements(By.CSS_SELECTOR, "div.test-section-container div[style*='nowrap']")
texts = []
for result in results:
    text = result.text
    texts.append(text)
    print(text)
Now you can click the individual items as:
first_result = results[0]
first_result.click()
and
second_result = results[1]
second_result.click()
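One caveat, stated as an assumption about the site rather than a known fact: if clicking the first article navigates away and the list is re-rendered when you come back, elements found earlier may no longer be valid, and Selenium can raise a StaleElementReferenceException. A possible workaround is to locate the titles again right before each click. The helper below is a hypothetical sketch (click_nth is not a Selenium API); it takes any callable that returns the current list of elements.

```python
def click_nth(find_titles, index):
    """Locate the titles afresh, then click the one at `index`.

    `find_titles` is any zero-argument callable returning a list of
    clickable elements (hypothetical helper, not part of Selenium).
    """
    titles = find_titles()
    if not 0 <= index < len(titles):
        raise IndexError(
            f"only {len(titles)} title(s) found, wanted index {index}"
        )
    titles[index].click()
    return titles[index]


# With Selenium this might be used as follows (assumes `driver` and `By`
# are set up as in the answer):
#   find = lambda: driver.find_elements(
#       By.CSS_SELECTOR, "div.test-section-container div[style*='nowrap']")
#   click_nth(find, 0)
#   driver.back()
#   click_nth(find, 1)
```

Passing a callable instead of a fixed list keeps the lookup and the click together, so each click works on elements from the page as it currently is.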
Note: You have to add the following import:
from selenium.webdriver.common.by import By