I am trying to get the URLs of YouTube thumbnails using Selenium. What I found online did not help: it suggested using .get_attribute("src"), which does not work.
I tried the code below. If I remove .get_attribute("src"), I no longer get any errors, but I do not get the thumbnails either:
import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://www.youtube.com/@MrBeast/videos")
SCROLL_PAUSE_TIME = 3
last_height = driver.execute_script("return document.documentElement.scrollHeight")
n=0
while n < 4:
    # Scroll down to the bottom of the page
    driver.execute_script("window.scrollTo(0, arguments[0]);", last_height)
    time.sleep(SCROLL_PAUSE_TIME)
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    # Stop early once scrolling no longer loads more content
    if new_height == last_height:
        break
    last_height = new_height
    n += 1
titles = driver.find_elements(By.ID, "video-title")
views = driver.find_elements(By.XPATH, '//*[@id="metadata-line"]/span[1]')
year = driver.find_elements(By.XPATH,'//*[@id="metadata-line"]/span[2]')
thumbnail = driver.find_elements(By.XPATH, '//*[@id="thumbnail"]/yt-image/img').get_attribute("src")
data = []
for i, j, k, l in zip(titles, views, year, thumbnail):
    data.append([i.text, j.text, k.text, l.text])
df = pd.DataFrame(data, columns = ['Title', 'views', 'date', 'thumbnail'])
df.to_csv('MrBeastThumbnails.csv')
driver.quit()
CodePudding user response:
find_elements returns a list of web elements, while .get_attribute() can be applied only to a single web element object.
To get the src attribute values, you need to iterate over the list of web elements and extract the src attribute from each one, as follows:
src_values = []
thumbnails = driver.find_elements(By.XPATH, '//*[@id="thumbnail"]/yt-image/img')
for thumbnail in thumbnails:
    src_values.append(thumbnail.get_attribute("src"))
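Once the src values are plain strings, the loop that builds the rows in the question also needs a small change, because a string has no .text attribute. The following is a minimal sketch of how the pieces might fit together. The helper names collect_src and build_rows are hypothetical and introduced only for illustration; they accept any objects that expose .text and .get_attribute(), such as the lists returned by find_elements. Note that get_attribute returns None when the attribute is missing, which may happen for thumbnails that have not been loaded yet, so some entries could be None.

```python
def collect_src(elements):
    """Return the src attribute of every element, in order.

    Hypothetical helper: `elements` is assumed to be a list of objects
    with a get_attribute() method, e.g. the result of find_elements.
    Entries are None for elements that have no src attribute.
    """
    return [element.get_attribute("src") for element in elements]


def build_rows(titles, views, dates, thumbnails):
    """Combine the four element lists into rows for a DataFrame.

    Hypothetical helper: the first three lists contribute their .text,
    while the thumbnail column uses the src string directly.
    """
    rows = []
    for title, view, date, src in zip(titles, views, dates, collect_src(thumbnails)):
        rows.append([title.text, view.text, date.text, src])
    return rows
```

With the variables from the question, this would be called as data = build_rows(titles, views, year, thumbnails), where thumbnails is the list returned by find_elements, and the result can be passed to pd.DataFrame as before. Because zip stops at the shortest list, rows are only as numerous as the smallest of the four lists.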