Unable to scrape the src if it is nested inside the source tag inside video via python selenium and

Time:11-27

I was scraping an anime website as a project, but when I tried to scrape the src attribute it gave me an error. The src is on a source tag nested inside the video tag. The screenshot and code are below.

example screenshot

Code :

    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from bs4 import BeautifulSoup
    import re
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # launch url
    url = "https://bestdubbedanime.com/Demon-Slayer-Kimetsu-no-Yaiba/26"

    # create a new Firefox session
    driver = webdriver.Firefox()
    # driver.implicitly_wait(30)
    driver.get(url)

    # python_button = driver.find_element_by_class_name('playostki') #FHSU
    # python_button.click() #click fhsu link

    # the page source is parsed here, before the play button is clicked
    soup1 = BeautifulSoup(driver.page_source, 'html.parser')

    video = soup1.find('video', id='my_video_1_html5_api')
    # video = driver.find_element_by_id('my_video_1_html5_api')
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".playostki"))).click()
    # quit() must be called with parentheses; it closes the window and ends the session
    driver.quit()

CodePudding user response:

You are not getting the src attribute because the source element only appears after the video is clicked. Click the video first, then wait for the source element and read its "src" attribute.

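    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Firefox()
    driver.maximize_window()
    driver.get("https://bestdubbedanime.com/Demon-Slayer-Kimetsu-no-Yaiba/26")
    WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='playostki']//img"))).click()
    print(WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#my_video_1_html5_api > source"))).get_attribute("src"))
    driver.quit()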

Output:

https://bestdubbedanime.com/xz/api/v.php?u=eVcxb0ZCUEMraFd1Vi9pM2xqWUhtbXZMWjZ0Mlpoc1U0Tmhqc2VFcVViQUc3VUVhR0pZV1EvaW1nY1duaXBMeXYvUUY4RG5ab3p4MEtEMUFHRmVaN0taVG9sY3ZVcTRoeDZoVHhWLzdiYjQ5UStNN2FYSjJBSWNKL0t5S1hLNGEyVlZqV1BYQ2MwaCsyNWcvak1Db01EMnNtWGwwTTBBVld4MkNER0V3eGNCRXJ0cEY4RHFPclhwbTJpWFBPSmJI
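
If you would rather parse driver.page_source yourself, as the question does with BeautifulSoup, the same rule applies: the source tag is only in the HTML once the click has happened. Below is a minimal sketch of that check using only the standard library's html.parser, so it runs without a browser. The names VideoSourceParser and find_video_src, the two HTML snippets, and the example.com URL are all hypothetical stand-ins for illustration, not the real page markup; only the video id comes from the question.

```python
from html.parser import HTMLParser


class VideoSourceParser(HTMLParser):
    """Collect the src of <source> tags nested inside one <video> element."""

    def __init__(self, video_id):
        super().__init__()
        self.video_id = video_id
        self.inside_video = False
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video" and attrs.get("id") == self.video_id:
            # Entered the video element we care about
            self.inside_video = True
        elif tag == "source" and self.inside_video and attrs.get("src"):
            self.sources.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "video":
            self.inside_video = False


def find_video_src(html, video_id="my_video_1_html5_api"):
    """Return the first nested <source> src, or None if there is none yet."""
    parser = VideoSourceParser(video_id)
    parser.feed(html)
    return parser.sources[0] if parser.sources else None


# Hypothetical, simplified markup (assumption: not copied from the real page).
BEFORE_CLICK = (
    '<div class="playostki"><img src="play.png"></div>'
    '<video id="my_video_1_html5_api"></video>'
)
AFTER_CLICK = (
    '<video id="my_video_1_html5_api">'
    '<source src="https://example.com/v.mp4" type="video/mp4">'
    '</video>'
)

print(find_video_src(BEFORE_CLICK))  # None
print(find_video_src(AFTER_CLICK))   # https://example.com/v.mp4
```

In a real run you would pass driver.page_source to find_video_src after the click and after waiting for the source element, which is why the question's code, which parses the page before clicking, finds nothing to read.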