Home > Net >  How to extract the value of the src attribute using Selenium
How to extract the value of the src attribute using Selenium

Time:03-26

I have problems with something I don't even know the name.

I'm trying to reach the link next to where it says src= "LINK" with selenium.

I have class name = tWeCI, I guess I need to achieve using this but I have no idea how to do it.

Code trials:

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get("https://www.instagram.com/p/CYMHMEGBSRT/")
a = browser.find_element(by=By.CLASS_NAME, value="tWeCl")

Output:

https://instagram.fist4-1.fna.fbcdn.net/v/t50.2886-16/271301182_1082110899295465_6686845989868801609_n.mp4?efg=eyJ2ZW5jb2RlX3RhZyI6InZ0c192b2RfdXJsZ2VuLjcyMC5jbGlwcy5iYXNlbGluZSIsInFlX2dyb3VwcyI6IltcImlnX3dlYl9kZWxpdmVyeV92dHNfb3RmXCJdIn0&_nc_ht=instagram.fist4-1.fna.fbcdn.net&_nc_cat=101&_nc_ohc=6RJahYZ39DIAX_ul9E4&edm=AABBvjUBAAAA&vs=453007466354843_3354420889&_nc_vs=HBksFQAYJEdENjZLeERwbE1LVExOZ0RBRWtPNE5YLWNzeGNicV9FQUFBRhUAAsgBABUAGCRHS0ZaS1JCMldIUXZxRzhCQUJ4bFB2ZnVrbE1hYnFfRUFBQUYVAgLIAQAoABgAGwAVAAAmqN/Frpms4z8VAigCQzMsF0A+mZmZmZmaGBJkYXNoX2Jhc2VsaW5lXzFfdjERAHX+BwA=&_nc_rid=c8bb85b1e6&ccb=7-4&oe=62408107&oh=00_AT-E77LrFH7G6VVXGeh8bRFQsu95hhQlqUGZdinzFBIUYQ&_nc_sid=83d603"

Snapshot of the HTML:

pic1

CodePudding user response:

To print the value of the href attribute you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR:

    driver.get('https://www.instagram.com/p/CYMHMEGBSRT/')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "video.tWeCl"))).get_attribute("src"))
    
  • Using XPATH:

    driver.get('https://www.instagram.com/p/CYMHMEGBSRT/')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//video[@class='tWeCl']"))).get_attribute("src"))
    
  • Console Output:

    https://instagram.fpnq13-2.fna.fbcdn.net/v/t50.2886-16/271301182_1082110899295465_6686845989868801609_n.mp4?efg=eyJ2ZW5jb2RlX3RhZyI6InZ0c192b2RfdXJsZ2VuLjcyMC5jbGlwcy5iYXNlbGluZSIsInFlX2dyb3VwcyI6IltcImlnX3dlYl9kZWxpdmVyeV92dHNfb3RmXCJdIn0&_nc_ht=instagram.fpnq13-2.fna.fbcdn.net&_nc_cat=101&_nc_ohc=6RJahYZ39DIAX-Iphu9&edm=AABBvjUBAAAA&vs=453007466354843_3354420889&_nc_vs=HBksFQAYJEdENjZLeERwbE1LVExOZ0RBRWtPNE5YLWNzeGNicV9FQUFBRhUAAsgBABUAGCRHS0ZaS1JCMldIUXZxRzhCQUJ4bFB2ZnVrbE1hYnFfRUFBQUYVAgLIAQAoABgAGwAVAAAmqN/Frpms4z8VAigCQzMsF0A+mZmZmZmaGBJkYXNoX2Jhc2VsaW5lXzFfdjERAHX+BwA=&_nc_rid=34a08d5a01&ccb=7-4&oe=62408107&oh=00_AT9sh_BH__zjeReDB7lde4t3avzYqDjimTJRnoZi6Lj-TQ&_nc_sid=83d603
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Related