How to get href link from html tag and also the data inside the link using selenium

Time:09-05

I am new to selenium. I am practicing selenium with this link: https://www.cagematch.net/?id=1&view=cards&year=2022&Day=&Month=&Year=2022&name=&promotion=1&showtype=Pay Per View

That page lists the upcoming events. Clicking an event shows its matches. I want to fetch the event links and then fetch the matches for each event, one by one. For example, Clash at the Castle has 5 matches, and the next event has another 5. So I want the list of events as well as the matches inside each event.

I have tried a little, but since I am very new to Selenium I can't figure out how to solve this. Here is the small amount of code I have so far; I don't know what to do next:

from selenium import webdriver
import os
from selenium.webdriver.common.by import By
import constant as const


class CageMatch(webdriver.Chrome):
    def __init__(self, path=r"/usr/local/bin/SeleniumDriver/", teardown=False):
        self.path = path
        self.teardown = teardown
        os.environ["PATH"] += os.pathsep + self.path
        super(CageMatch, self).__init__()
        self.maximize_window()

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.teardown:
            self.quit()

    def find_web_page(self):
        self.get(const.EVENTS_URL)

    def extract_urls_from_td(self):
        pass

As you can see, I have a function called extract_urls_from_td. I want this function to fetch the href of every event, then visit each href and fetch the details. What is the easiest way to do this?

CodePudding user response:

This can be done as follows:

  1. Close the cookie banner at the bottom of the page.
  2. Get the links to all the events.
  3. On each event page, do whatever you need to do there. As an example, I print the match titles.
  4. Go back to the main page and locate the links again, because the previously located elements become stale once we navigate to an event page.
  5. To make the code smooth and stable, I use explicit waits via WebDriverWait.
    The code below works; its output is shown after it.
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("--start-maximized")

s = Service(r'C:\webdrivers\chromedriver.exe')

driver = webdriver.Chrome(options=options, service=s)

url = 'https://www.cagematch.net/?id=1&view=cards&year=2022&Day=&Month=&Year=2022&name=&promotion=1&showtype=Pay Per View'

wait = WebDriverWait(driver, 10)
driver.get(url)

wait.until(EC.element_to_be_clickable((By.ID, "cookiedingsbumsCloser"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//tr[contains(@class,'TRowCard')]//a[2]")))
time.sleep(0.5)
links = driver.find_elements(By.XPATH, "//tr[contains(@class,'TRowCard')]//a[2]")
for idx in range(len(links)):
    # Index into the most recently located list; elements found before a navigation are stale.
    wait.until(EC.element_to_be_clickable(links[idx])).click()
    print("link " + str(idx + 1) + " matches:")
    try:
        titles = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.MatchType")))
        for title in titles:
            print(title.text)
    except Exception:
        # The wait timed out: no match titles are visible on this page.
        print("No matches found")
    print(" ")
    driver.back()
    # Wait for the event list to be usable again, then locate the links afresh.
    wait.until(EC.element_to_be_clickable((By.XPATH, "//tr[contains(@class,'TRowCard')]//a[2]")))
    time.sleep(0.5)
    links = driver.find_elements(By.XPATH, "//tr[contains(@class,'TRowCard')]//a[2]")

Output:
link 1 matches:
Tag Team Match
Six Man Tag Team Match
Singles Match
WWE SmackDown Women's Title Match
WWE Intercontinental Title Match
WWE Title / WWE Universal Title Match
 
link 2 matches:
WWE NXT North American Title Match
WWE NXT Tag Team Title / WWE NXT UK Tag Team Title Elimination Fatal Four Way Unification Match
WWE NXT Women's Tag Team Title Match
WWE NXT Women's Title / WWE NXT UK Women's Title Unification Match
WWE NXT Title / WWE NXT United Kingdom Title Unification Match
 
link 3 matches:
No matches found
 
link 4 matches:
No matches found
 
link 5 matches:
No matches found
 

Process finished with exit code 0
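
Since the question asks for the href values themselves, an alternative worth considering is to read every href as a plain string first and then open each URL directly with driver.get(). Strings cannot go stale, so nothing has to be located again after navigating, and because nothing is clicked the cookie banner should not get in the way. The following is only a minimal sketch: the helper names collect_hrefs, scrape_events and main are made up for illustration, the locators are the ones used in the code above, and it assumes the match titles are already present once the event page has loaded (if they are not, an explicit wait would be needed before reading them). It also assumes that webdriver.Chrome() can find a driver on its own; otherwise pass a Service as in the code above.

```python
def collect_hrefs(anchors):
    """Return the href of each anchor element, de-duplicated, in page order."""
    hrefs = []
    for anchor in anchors:
        href = anchor.get_attribute("href")
        if href and href not in hrefs:
            hrefs.append(href)
    return hrefs


def scrape_events(driver, events_url, link_locator, title_locator):
    """Map each event URL to the list of match titles found on its page.

    link_locator and title_locator are (by, value) tuples,
    e.g. (By.XPATH, "...") or (By.CSS_SELECTOR, "...").
    """
    driver.get(events_url)
    # Read the hrefs as plain strings while the listing page is still loaded.
    event_urls = collect_hrefs(driver.find_elements(*link_locator))

    results = {}
    for event_url in event_urls:
        # Navigate directly: no click and no driver.back() are needed.
        driver.get(event_url)
        # find_elements returns an empty list (no exception) when nothing matches.
        titles = driver.find_elements(*title_locator)
        results[event_url] = [title.text for title in titles]
    return results


def main():
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    # URL and locators are copied from the answer above.
    url = ("https://www.cagematch.net/?id=1&view=cards&year=2022&Day=&Month="
           "&Year=2022&name=&promotion=1&showtype=Pay Per View")
    driver = webdriver.Chrome()
    try:
        events = scrape_events(
            driver,
            url,
            (By.XPATH, "//tr[contains(@class,'TRowCard')]//a[2]"),
            (By.CSS_SELECTOR, "div.MatchType"),
        )
        for event_url, titles in events.items():
            print(event_url)
            for title in titles:
                print("   ", title)
    finally:
        driver.quit()
```

Call main() to run it against the site. An event whose page has no div.MatchType elements simply maps to an empty list, which corresponds to the "No matches found" case in the output above.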