Python Selenium find_elements in multiple pages give the same result all the time

Time:02-24

I have a simple example of the problem. I run driver.find_elements(By.ID, "thumbnail") and it works. I then click on a random element and scrape the page again in a loop, but from the second iteration on I always get exactly the same results:

driver.get("https://www.somepage.com")
time.sleep(7)
items = []

for i in range(3):
    print("LOOP #: " + str(i))
    random_number = random.randint(1, 5)
    items = driver.find_elements(By.ID, "thumbnail")
    url = i.get_attribute("href")
    print(str(url))
    items[random_number].click()
    time.sleep(100)

OUTPUT

LOOP #: 0
URL 1
URL 2
URL 3
URL 4
LOOP #: 1
URL 1
URL 2
URL 3
URL 4
LOOP #: 2
URL 1
URL 2
URL 3
URL 4

The second iteration should print different URLs, because a different page has been loaded by then. The locator find_elements(By.ID, "thumbnail") still matches elements on the new page.

I don't know what I'm doing wrong. I even tried to add items.clear() at the end of the loop, same result.

Answer:

The answer below uses YouTube, since it was given as an example when I asked. When the YouTube home page opens, it contains many elements with the id thumbnail. The strategy is to iterate three times and, in each iteration, collect all elements with the id thumbnail, pick a random one, read its href, and click it. The open question is how to continue after the click. There are two options: (1) stay on the video page and pick the next thumbnail from the pane of suggested videos, or (2) click the home link (the YouTube icon) and repeat the process from the home page.

I went with the second option. Here is the code for it:

driver.get('https://www.youtube.com/')
for i in range(3):
    print("LOOP #: " + str(i))
    time.sleep(10)
    items = driver.find_elements(By.ID, "thumbnail")
    # In the question, get_attribute is called on i, which is the loop counter (an int), not an element, so that line did not work for me.
    # Instead, I pick a random element from items, print its href, click it, then click the home link and repeat.
    rand = random.choice(items)
    print(rand.get_attribute('href'))
    rand.click()
    time.sleep(3)
    driver.find_element(By.XPATH, "(//*[@title='YouTube Home'])[1]").click()
driver.quit()

Output:

LOOP #: 0
https://www.youtube.com/watch?v=YIKz49-aGas
LOOP #: 1
https://www.youtube.com/watch?v=51Qs0Ej2RUc
LOOP #: 2
https://www.youtube.com/watch?v=OeShsZPOP-s

Process finished with exit code 0
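
The selection step in the loop above can also be isolated into a small helper. This is only a sketch: pick_random_link is a hypothetical name, and the idea that some matched elements may have no href is an assumption, not something the thread confirms. What is certain is that random.randint(1, 5) in the question includes both endpoints, so it never picks index 0 and raises IndexError when fewer than six elements are found, whereas random.choice always stays within the list.

```python
import random


def pick_random_link(elements, rng=random):
    """Return (element, href) for a randomly chosen element that has an href.

    `elements` is any list of objects with a get_attribute(name) method,
    such as the list returned by driver.find_elements(...).
    """
    # Read each href once and pair it with its element.
    pairs = [(el, el.get_attribute("href")) for el in elements]
    # Assumption: some matches may carry no href, so drop those.
    candidates = [(el, href) for el, href in pairs if href]
    if not candidates:
        raise ValueError("no element with an href attribute was found")
    # choice() cannot go out of range, unlike a hard-coded randint(1, 5).
    return rng.choice(candidates)
```

In the loop it would be used as element, href = pick_random_link(items), followed by print(href) and element.click().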

Note: For more robust code, you can replace time.sleep with an explicit wait such as WebDriverWait. That said, YouTube is a Google property, and its element attributes seem to change often, so locators can be flaky. The bot may also get detected if it sends too many requests.

UPDATED ANSWER:

This version clicks a thumbnail on the home page first, then keeps clicking thumbnails from the suggested-videos pane on the right of the video page:

driver.maximize_window()
driver.get('https://www.youtube.com/')
time.sleep(10)
items = driver.find_elements(By.ID, "thumbnail")
rand = random.choice(items)
print(rand.get_attribute('href'))
rand.click()
time.sleep(3)
for i in range(3):
    print(f"Loop#: {i}")
    # Wait until the video player is visible, i.e. the watch page has loaded.
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "movie_player")))
    # Collect the thumbnails in the suggested-videos pane next to the player.
    yt_side_pane_items = driver.find_elements(By.XPATH, "//*[@id='items']//*[@id='thumbnail']")
    rand_side_pane = random.choice(yt_side_pane_items)
    print(rand_side_pane.get_attribute('href'))
    rand_side_pane.click()
    time.sleep(5)
driver.quit()

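Imports (in addition to random, time, and By from selenium.webdriver.common.by, which the code above also uses):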

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Output:

https://www.youtube.com/watch?v=9YSbflKeOZQ
Loop#: 0
https://www.youtube.com/watch?v=ENOEgKeI_D0
Loop#: 1
https://www.youtube.com/watch?v=PeByUAhHXqs
Loop#: 2
https://www.youtube.com/watch?v=GUHfY84weMw

Process finished with exit code 0