Get 'src' link from image using selenium

Time:01-28

I am trying to get the URLs of YouTube thumbnails using Selenium. The advice I found online suggests calling .get_attribute("src"), but that does not work for me.

This is what I tried. If I remove .get_attribute("src"), the script runs without errors, but I still do not get the thumbnails:

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By


driver = webdriver.Chrome()
driver.get("https://www.youtube.com/@MrBeast/videos")

SCROLL_PAUSE_TIME = 3

last_height = driver.execute_script("return document.documentElement.scrollHeight")
n = 0
while n < 4:
    # Scroll down to the bottom
    driver.execute_script("window.scrollTo(0, arguments[0]);", last_height)
    time.sleep(SCROLL_PAUSE_TIME)
    
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
    n += 1
titles = driver.find_elements(By.ID, "video-title")
views = driver.find_elements(By.XPATH, '//*[@id="metadata-line"]/span[1]')
year = driver.find_elements(By.XPATH,'//*[@id="metadata-line"]/span[2]')
thumbnail = driver.find_elements(By.XPATH, '//*[@id="thumbnail"]/yt-image/img').get_attribute("src")

data = []
for i,j,k,l in zip(titles, views, year, thumbnail):
    data.append([i.text, j.text, k.text, l.text])
df = pd.DataFrame(data, columns = ['Title', 'views', 'date', 'thumbnail'])
df.to_csv('MrBeastThumbnails.csv')

driver.quit()

Answer:

find_elements returns a list of web elements, whereas .get_attribute() can only be called on a single web element. Calling it on the list raises AttributeError: 'list' object has no attribute 'get_attribute'.
To get the src values, iterate over the list and read the src attribute of each element, as follows:

src_values = []
thumbnails = driver.find_elements(By.XPATH, '//*[@id="thumbnail"]/yt-image/img')
for thumbnail in thumbnails:
    src_values.append(thumbnail.get_attribute("src"))    
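
The loop in the question also reads l.text for the thumbnail. An img element has no visible text, so that column would stay empty even once thumbnail is a list of elements. Below is a minimal sketch of how the rows could be built instead, assuming the four lists come from find_elements as in the question. build_rows is a hypothetical helper name, not part of Selenium.

```python
def build_rows(titles, views, dates, thumbnails):
    """Combine parallel lists of web elements into rows of plain strings.

    Each argument is expected to be a list such as the one returned by
    driver.find_elements(...). This is an illustrative helper only.
    """
    rows = []
    for title, view, date, thumb in zip(titles, views, dates, thumbnails):
        # <img> elements carry the URL in an attribute, not in .text.
        # get_attribute returns None when the attribute is missing,
        # so fall back to an empty string in that case.
        src = thumb.get_attribute("src")
        rows.append([title.text, view.text, date.text, src or ""])
    return rows
```

With the question's variables this would be called as data = build_rows(titles, views, year, thumbnails), where thumbnails is the find_elements result without the trailing .get_attribute("src"). Because zip stops at the shortest list, rows are silently dropped if the four lists differ in length.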