Selenium on Colab: Wrong results from text extraction in Python

Time:01-17

When I use Chromedriver in Python to scrape a certain website, I get wrong results only when I run the script in Colab. If I run it in Spyder, for example, everything works fine.

It seems to me that Selenium still finds the right elements, but it extracts strange numbers that I cannot find anywhere on the website.

Website with results: Website with desired numbers (https://i.stack.imgur.com/bEKbh.png)

What Colab returns: Results from Colab (https://i.stack.imgur.com/1EbOM.png)

Website: "https://www.oddsportal.com/soccer/croatia/hnl/hnk-gorica-varazdin-Kr4sLgwt/#1X2;2"

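I am using this function to test whether an element exists:

def fi(a):
    # Return True if the XPath matches an element whose text can be read, False otherwise.
    try:
        driver.find_element("xpath", a).text
        return True
    except Exception:
        return False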

And this one to get the text:

def ffi(a):
    # Return the element's text, or None if the XPath matches nothing.
    if fi(a) != False:
        return driver.find_element("xpath", a).text

And this is the full code:

driver.get("https://www.oddsportal.com/soccer/croatia/hnl/hnk-gorica-varazdin-Kr4sLgwt/#1X2;2")

for j in range(1,15):
    print(j)
    # Bookmaker name in row j
    book = ffi('((//*[starts-with(@class,"flex text-xs max")])[{}]//p)[1]'.format(j))

    # Odds for "1": use the <p> text unless the odd is wrapped in a link
    if fi('((//*[starts-with(@class,"flex text-xs max")])[{}]//p)[2]//preceding-sibling::a'.format(j)) == False:
        Odd_1 = ffi('((//*[starts-with(@class,"flex text-xs max")])[{}]//p)[2]'.format(j))
    else:
        # ffi (not fi) is needed here, otherwise no text is returned
        Odd_1 = ffi('((//*[starts-with(@class,"flex text-xs max")])[{}]//a)[5]'.format(j))

    # Odds for "X"
    if fi('((//*[starts-with(@class,"flex text-xs max")])[{}]//p)[3]//preceding-sibling::a'.format(j)) == False:
        Odd_X = ffi('((//*[starts-with(@class,"flex text-xs max")])[{}]//p)[3]'.format(j))
    else:
        Odd_X = ffi('((//*[starts-with(@class,"flex text-xs max")])[{}]//a)[6]'.format(j))

    # Odds for "2"
    if fi('((//*[starts-with(@class,"flex text-xs max")])[{}]//p)[4]//preceding-sibling::a'.format(j)) == False:
        Odd_2 = ffi('((//*[starts-with(@class,"flex text-xs max")])[{}]//p)[4]'.format(j))
    else:
        Odd_2 = ffi('((//*[starts-with(@class,"flex text-xs max")])[{}]//a)[7]'.format(j))

    ab = ffi('//div[contains(@class,"flex items-center w-full h-auto")]//p')
    bc = ffi('(//div[contains(@class,"flex px")]//child::div)[3]')
    print(book, Odd_1, Odd_X, Odd_2, ab, bc)

Once again, it works fine in Spyder.

User response:

Your IP address and language settings may differ when the script runs in Google Colab. This can lead to inconsistent content or a different page layout.
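
One way to rule out the language and layout difference is to pin the browser language and window size explicitly when creating the driver in Colab. This is only a sketch: the helper names build_chrome_args and make_driver, the en-US locale and the 1920x1080 window size are assumptions, not values from the question, so replace them with whatever your local browser uses. The --headless=new switch assumes a reasonably recent Chrome; older versions use --headless.

```python
def build_chrome_args(lang="en-US", width=1920, height=1080):
    """Return Chrome command-line switches for a headless Colab session.

    The locale and window size are placeholder assumptions; match them to
    the browser that gives correct results locally.
    """
    return [
        "--headless=new",           # Colab has no display
        "--no-sandbox",             # commonly needed when Chrome runs as root in a container
        "--disable-dev-shm-usage",  # work around a small /dev/shm in containers
        f"--lang={lang}",           # requested UI language
        f"--window-size={width},{height}",  # a small default window can change the layout
    ]


def make_driver(lang="en-US"):
    """Create a Chrome driver with the switches above (requires selenium)."""
    # Imported lazily so build_chrome_args has no dependencies of its own.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(lang=lang):
        options.add_argument(arg)
    # Accept-Language preference; whether headless Chrome honours it should be
    # verified, e.g. by reading navigator.language from the page.
    options.add_experimental_option("prefs", {"intl.accept_languages": lang})
    return webdriver.Chrome(options=options)
```

If the numbers still differ after this, the cause is more likely the IP address or the page content itself than the browser language.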

Try taking a screenshot and saving the page source in Colab, then compare both with your local tests. Also check whether your locators still match the page source you get in Colab.

Also make sure that you are using the same browser.
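
A minimal sketch of that comparison step, assuming an already created Selenium driver that has loaded the page. The function name dump_page_state and the file names are made up for illustration; save_screenshot, page_source, execute_script and get_window_size are regular WebDriver members.

```python
from pathlib import Path


def dump_page_state(driver, prefix="colab"):
    """Save a screenshot, the page source and some browser details.

    `prefix` is a hypothetical label used only to name the output files.
    """
    screenshot_path = Path(f"{prefix}_screenshot.png")
    source_path = Path(f"{prefix}_page_source.html")

    # What the browser actually rendered
    driver.save_screenshot(str(screenshot_path))
    # The DOM as Selenium sees it, which is what the XPath runs against
    source_path.write_text(driver.page_source, encoding="utf-8")

    # Settings that can differ between Colab and a local machine
    info = {
        "language": driver.execute_script("return navigator.language"),
        "user_agent": driver.execute_script("return navigator.userAgent"),
        "window_size": driver.get_window_size(),
    }
    return screenshot_path, source_path, info
```

Run it once in Colab and once locally with different prefixes, for example dump_page_state(driver, "colab") and dump_page_state(driver, "local"), then compare the two HTML files and the returned details.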
