Get Youtube video title using classname and text attribute using Selenium and Python

Time:11-25

Hi, I'm using Python with Selenium WebDriver to get a YouTube video title, but I keep getting more text than I want. The line is: driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text

Is there a way to fix this and make it more efficient, so that it returns only the title? Here is the test script I'm using:

from selenium import webdriver as wd
from time import sleep as zz

driver = wd.Firefox(executable_path=r'./geckodriver.exe')
driver.get('https://www.youtube.com/watch?v=wma0szfIafk')
zz(4)
test_atr = driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text
print(test_atr)

CodePudding user response:

To print only the title text, OBI-WAN KENOBI Official Trailer (2022) Teaser, you can use either of the following locator strategies:

  • Using CSS_SELECTOR and get_attribute("innerHTML"):

    print(driver.find_element(By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer").get_attribute("innerHTML"))
    
  • Using XPATH and text attribute:

    print(driver.find_element(By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']").text)
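
The two locators above target the same elements: the XPath compares the full class attribute string, while the CSS selector chains each class with a dot. A space-separated value such as "style-scope ytd-video-primary-info-renderer" is several classes rather than one class name, which is probably why the class-name lookup in the question matched a larger container. The following is a minimal sketch of that mapping; `css_from_class_attr` is a hypothetical helper written for illustration and is not part of Selenium.

```python
def css_from_class_attr(tag: str, class_attr: str) -> str:
    """Build a CSS selector for `tag` elements carrying every class in `class_attr`."""
    classes = class_attr.split()  # split on whitespace; empty pieces are dropped
    return tag + "".join("." + name for name in classes)


# Class attribute values copied from the XPath expression above.
h1 = css_from_class_attr("h1", "title style-scope ytd-video-primary-info-renderer")
title = css_from_class_attr("yt-formatted-string", "style-scope ytd-video-primary-info-renderer")

# ">" selects a direct child, like the single "/" step in the XPath.
selector = f"{h1} > {title}"
print(selector)
```

The printed string is the same selector passed to By.CSS_SELECTOR in the answer.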
    

Ideally, you should use WebDriverWait with visibility_of_element_located(), so the lookup waits until the title element is actually visible. You can use either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer"))).text)
    
  • Using XPATH and get_attribute("innerHTML"):

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']"))).get_attribute("innerHTML"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console Output:

    OBI-WAN KENOBI Official Trailer (2022) Teaser
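
To illustrate why an explicit wait is preferable to the fixed zz(4) sleep in the question's script, here is a simplified, pure-Python sketch of the polling idea behind such a wait. It is not Selenium's implementation: `wait_until`, `title_if_ready`, and the poll interval are assumptions made for this example. The point is that the wait returns as soon as the condition succeeds and fails loudly if it never does.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call `condition()` repeatedly until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand back whatever the condition found
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for a page whose title element only becomes available on the third check.
attempts = {"count": 0}


def title_if_ready():
    attempts["count"] += 1
    return "title element" if attempts["count"] >= 3 else None


found = wait_until(title_if_ready, timeout=5.0, poll=0.01)
print(found, attempts["count"])
```

With a real driver, WebDriverWait(driver, 20).until(...) plays the role of `wait_until`, and the expected condition plays the role of `title_if_ready`.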
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

