How to use Python's Selenium to scrape search results?

Time:03-20

I am trying to scrape the results of Superhero battles created at the website https://www.superherodb.com/battle/create/

I've already scraped the list of all the superheroes and their stats from the website. Now I want to enter their names and see who would win each battle. I want to run every superhero against every other one, e.g. Superman vs Thor, then Superman vs Spider-Man, and so on.

My list of characters and their info:


Characters = ["Superman", "Thor", "Spider-Man"]
Names = ["Kal-El", "Thor Odinson", "Peter Parker"]
Universes = ["Prime Earth", "Earth-616", "Earth-616"] 

My ultimate goal is a for loop that runs each character against all the other characters in the list, but I cannot get at the site's search-result list from Selenium.

from selenium import webdriver
from selenium.webdriver.common.by import By

for i in range(len(Characters)):
    # Raw string so the backslashes in the Windows path are kept literally.
    # Recent Selenium 4 releases expect a Service object instead of executable_path.
    driver = webdriver.Chrome(executable_path=r'C:\Webdriver\chromedriver.exe')
    driver.get('https://www.superherodb.com/battle/create')

    add = driver.find_element(By.XPATH, "//*[@id='team1']/div/a")
    add.click()

    text_area = driver.find_element(By.XPATH, "//*[@id='s']")  # team1 search bar
    text_area.send_keys(Characters[i])

    contents = driver.find_element(By.XPATH, "//*[@id='quickselect_result']")
    suffixes = [elem.text for elem in contents.find_elements(By.XPATH, './/span[@class="suffix level-1"]')]

    if Names[i] in suffixes and Universes[i] in suffixes:
        pass  # Here I want it to select that superhero in the team1 section, and replicate it for team2.

The for loop should also check for the specific version of each character (e.g. Superman) with the matching name (Kal-El) and universe (Prime Earth), enter it in team1, and then do the same in team2 for the opposing superhero. The resulting page would be, for example: https://www.superherodb.com/superman-vs-thor/90-103/

I have already figured out the next part. Once both superheroes are entered in team1 and team2, I scrape the result of the battle like this:

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

wait = WebDriverWait(driver, 10)
urls = ['https://www.superherodb.com/superman-vs-thor/90-103/']
names = ['Superman_vs_Thor']
complete_list = {}
for indx, url in enumerate(urls):
    driver.get(url)
    battles = []
    # Collect whichever of the win / draw / lose blocks appear, in that order.
    for outcome in ('win', 'draw', 'lose'):
        xpath = f"//div[@class='battle-team-result {outcome}']"
        try:
            battles.append(wait.until(EC.visibility_of_element_located((By.XPATH, xpath))).text)
        except TimeoutException:
            pass  # this outcome block is not shown on the page
    complete_list[names[indx]] = battles

print(complete_list)

Which gives me:

{'Superman_vs_Thor': ['912 wins (52%)', '35 (2%)', '806 wins (46%)']}
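
For comparing battles later, result strings like the ones above can be turned into numbers. The following is a minimal sketch, assuming every string has the form "N wins (P%)" or "N (P%)" as in the output shown; `parse_result` and `RESULT_RE` are hypothetical names, not part of the original code.

```python
import re

# Matches "912 wins (52%)" as well as the draw form "35 (2%)".
RESULT_RE = re.compile(r"^\s*(\d+)(?:\s+wins)?\s*\((\d+)%\)\s*$")


def parse_result(text):
    """Return (count, percent) from a result string, or None if it does not match."""
    match = RESULT_RE.match(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample data copied from the output shown above.
sample = {'Superman_vs_Thor': ['912 wins (52%)', '35 (2%)', '806 wins (46%)']}
parsed = {name: [parse_result(t) for t in texts] for name, texts in sample.items()}
print(parsed)
```

Returning `None` for text that does not match keeps an unexpected string from silently becoming a wrong number.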

What I am stuck on is selecting the specific version of each character, by name and universe, in both team1 and team2, then creating the battle and opening its page to scrape the results, because the list of matching characters is not appearing in my code.

[Image: menu for selecting a superhero]

I am stuck here: after entering the superhero's name, how do I access the list and select the superhero with the right name and universe?
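
The selection step can be framed as a small matching problem: read the text of each entry in the result list and take the first one that mentions both the name and the universe. The following is a minimal sketch of that logic only, kept free of Selenium so it can be checked on its own. `pick_entry` is a hypothetical helper and the row texts are invented for illustration; how the site actually lays out each entry is an assumption.

```python
def pick_entry(row_texts, name, universe):
    """Return the index of the first row mentioning both name and universe, else None."""
    name, universe = name.lower(), universe.lower()
    for index, text in enumerate(row_texts):
        lowered = text.lower()
        if name in lowered and universe in lowered:
            return index
    return None


# Hypothetical row texts; the real ones would be read from the page.
rows = [
    "Superman Clark Kent (Earth-1)",
    "Superman Kal-El (Prime Earth)",
    "Superman Kal-L (Earth-Two)",
]
print(pick_entry(rows, "Kal-El", "Prime Earth"))  # prints 1
```

With Selenium, the row texts could come from the `.text` of each `li` element under `quickselect_result`, and the element at the returned index would be the one to click. Substring matching is deliberately loose, so a universe such as "Earth-616" would also match a longer label that contains it.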

CodePudding user response:

Here is code that selects the characters. You still have to add the loop, rotate through the characters, and print the results.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from time import sleep


#Characters = ["Superman", "Thor", "Spider-Man"]
#options: (superman vs thor) (superman vs spider man) (spider man vs thor)

char_1 = 'superman'
char_2 = 'spider'

driver = webdriver.Chrome('C:/chromedriver.exe')
driver.get('https://www.superherodb.com/battle/create')
wait = WebDriverWait(driver, 5)

# click a button in the qc-cmp2-ui dialog (presumably the cookie-consent prompt)
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="qc-cmp2-ui"]/div[2]/div/button[3]'))).click()
driver.execute_script("window.scrollTo(0, 300)")



def team1():
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="team1"]/div/a'))).click() # add member team 1
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="f_pub"]'))).click() # universe dropdown list team 1
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="f_pub"]/option[2]'))).click() # dc comics team 1
    wait.until(EC.visibility_of_element_located((By.NAME, 'quickselect'))).send_keys(char_1) # write character name team 1

    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="quickselect_result"]/li[2]'))).click() # superman kal el team 1
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="add_member"]/div/div[3]/a'))).click() # done button team 1
    return


def team2():
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="team2"]/div/a'))).click() # add member team 2
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="f_pub"]'))).click() # universe dropdown list team 2
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="f_pub"]/option[3]'))).click() # marvel team 2
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="s"]'))).clear() # clear character dropdown list team 2
    wait.until(EC.visibility_of_element_located((By.NAME, 'quickselect'))).send_keys(char_2) # write character name team 2
    sleep(1)
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="quickselect_result"]/li[1]'))).click() # spiderman peter parker team 2
    wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="add_member"]/div/div[3]/a'))).click() # done button team 2
    return

def battle_result():
    wait.until(EC.visibility_of_element_located((By.NAME, 'battle_start'))).click()
    # your turn: scrape the battle result from the page that opens


team1()
team2()
battle_result()
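
The loop and character rotation left open above could be driven by `itertools.combinations`, which yields each unordered pair exactly once. The following is a minimal sketch, assuming the three parallel lists from the question; `fighters` and `matchups` are hypothetical names, and `team1()` and `team2()` would need to be changed to accept a character instead of reading `char_1` and `char_2`.

```python
from itertools import combinations

Characters = ["Superman", "Thor", "Spider-Man"]
Names = ["Kal-El", "Thor Odinson", "Peter Parker"]
Universes = ["Prime Earth", "Earth-616", "Earth-616"]

# Bundle the three parallel lists so each fighter travels as one tuple.
fighters = list(zip(Characters, Names, Universes))

# combinations(..., 2) yields each unordered pair exactly once.
matchups = list(combinations(fighters, 2))

for first, second in matchups:
    # Each of first/second is a (character, name, universe) tuple.
    print(f"{first[0]} vs {second[0]}")
```

For three characters this prints three matchups. Each tuple carries the name and universe along, so the selection step has what it needs to pick the right version of the character.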