XPath works in the Chrome console but does not work in Selenium

Time:08-20

Here is a screenshot of the HTML structure of the page I am trying to scrape.

You can see that there is a <table> element with class "waffle". When I use the XPath //table[@class='waffle'] in the Chrome console, it works as expected.

However, when I use the same XPath in Selenium, it doesn't work:

  container_xpath = "//table[@class='waffle']"
  # Wait up to 30 seconds for the table to appear in the DOM
  try:
    wait = WebDriverWait(driver, 30)
    container = wait.until(EC.presence_of_element_located((By.XPATH, container_xpath)))
    print('container found')
  except Exception as e:
    print('container not found')
    # Chain the original exception so the real cause stays visible in the traceback
    raise PageDidNotLoadError from e
  return

The Python script prints "container not found".

What is wrong with Selenium?

CodePudding user response:

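It is common for content like this to sit inside nested iframes. Selenium only searches the current browsing context, which is the top-level document by default, so the XPath finds nothing until you switch into the frame that contains the table. You need to switch to the outer iframe first and then to the inner one. The code below should work for you: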

# Switch to outer iframe
oframe = driver.find_element(By.CSS_SELECTOR, 'iframe')
driver.switch_to.frame(oframe)
# Switch to nested frame
iframe = driver.find_element(By.CSS_SELECTOR, 'iframe#pageswitcher-content')
driver.switch_to.frame(iframe)

# get the container
container = wait.until(EC.presence_of_element_located((By.XPATH, container_xpath)))
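
The switching steps above can be wrapped in a small helper. This is only a sketch: `inside_frames` is a hypothetical name, not part of Selenium, and it assumes `driver` is a Selenium WebDriver and that each locator is a `(By.…, selector)` tuple supplied by the caller. It walks the frame locators in order and always returns to the top-level document afterwards.

```python
from contextlib import contextmanager


@contextmanager
def inside_frames(driver, frame_locators):
    """Switch through nested iframes in order, then go back to the top-level document."""
    # Start from a known context in case an earlier call left us inside a frame.
    driver.switch_to.default_content()
    try:
        for by, value in frame_locators:
            frame = driver.find_element(by, value)
            driver.switch_to.frame(frame)
        yield driver
    finally:
        # Always leave the driver at the top-level document.
        driver.switch_to.default_content()


# Usage (assumes `from selenium.webdriver.common.by import By` and a live driver):
# frames = [(By.CSS_SELECTOR, "iframe"),
#           (By.CSS_SELECTOR, "iframe#pageswitcher-content")]
# with inside_frames(driver, frames) as d:
#     html = d.find_element(By.XPATH, "//table[@class='waffle']").get_attribute("outerHTML")
```

Read whatever you need from the element inside the `with` block, because element lookups and interactions only work while the driver is still in the frame that contains the element.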

To get the container's contents as a table, you can parse its HTML with pandas:

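from io import StringIO
import pandas as pd

# read_html returns a list of DataFrames, so take the first one;
# StringIO avoids the literal-HTML deprecation warning in newer pandas.
table = pd.read_html(StringIO(container.get_attribute('outerHTML')))[0]

Output: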
Unnamed: 0 Unnamed: 1 Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6
0 1 カード名 仕様 レア 型番 タイプ 状態A
1 nan nan nan nan nan nan nan
2 2 nan nan nan nan nan nan
3 3 【スペシャルアート(TAG TEAM GX)】 nan nan nan nan nan
4 4 フシギバナ&ツタージャGX SA SR 066/064 3300
5 5 セレビィ&フシギバナGX SA SR 097/095 3500
6 6 モクロー&アローラナッシーGX SA SR 056/054 3300
7 7 フェローチェ&マッシブーンGX SA SR 056/054 2300
8 8 レシラム&リザードンGX SA SR 097/095 20000
9 9 リザードン&テールナーGX SA SR 068/064 6000
10 10 カメックス&ポッチャマGX SA SR 070/064 5000
11 11 コイキング&ホエルオーGX SA SR 099/095 5500
12 12 ヤドン&コダックGX SA SR 096/094 4000
13 13 ピカチュウ&ゼクロムGX SA SR 101/095 30000
14 14 ライチュウ&アローラライチュウGX SA SR 057/054 5500
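
In the output above, the real column names appear in the first data row and the first column only repeats the spreadsheet's row numbers. A possible cleanup, sketched under those assumptions, is shown below. `promote_header` is a hypothetical helper name, and the small `raw` frame is a stand-in built from a few values in the output, not the full table.

```python
import pandas as pd


def promote_header(raw: pd.DataFrame) -> pd.DataFrame:
    """Use the first row as column names and drop rows that are entirely empty."""
    # Assumption: the first column only holds spreadsheet row numbers.
    body = raw.iloc[:, 1:]
    header = body.iloc[0].tolist()
    cleaned = body.iloc[1:].copy()
    cleaned.columns = header
    # Rows with no values at all are spacer rows in the sheet.
    return cleaned.dropna(how="all").reset_index(drop=True)


# Tiny stand-in for the parsed table (values copied from the output above).
raw = pd.DataFrame([
    [1, "カード名", "仕様", "レア", "型番"],
    [None, None, None, None, None],
    [4, "フシギバナ&ツタージャGX", "SA", "SR", "066/064"],
])
cleaned = promote_header(raw)
print(cleaned)
```

With the real data you would call `promote_header(table)` on the DataFrame returned by `read_html`.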

CodePudding user response:

The table is inside two nested iframes. The outer one embeds the published spreadsheet, and the inner one (id "pageswitcher-content") holds the sheet itself:

<iframe style="border-width: 2px; border-style: solid; border-color: red; width: 1000px; height: 200000px;" src="https://docs.google.com/spreadsheets/d/e/2PACX-1vQT3Q9qDbZUpnP3_WH2I5qw8O-U_PqXVhhoIzH2o-tSzeDND9FTuoGKbZiNHTbrzTgKAUA2_SvXFh_2/pubhtml?gid=159569114&amp;single=true&amp;widget=true&amp;headers=false&amp;gid=0&amp;range=A:F" width="320" height="240"></iframe>    

<iframe id="pageswitcher-content" frameborder="0" marginheight="0" marginwidth="0" src="https://docs.google.com/spreadsheets/d/e/2PACX-1vQT3Q9qDbZUpnP3_WH2I5qw8O-U_PqXVhhoIzH2o-tSzeDND9FTuoGKbZiNHTbrzTgKAUA2_SvXFh_2/pubhtml/sheet?headers=false&amp;gid=159569114&amp;range=A:F" style="display: block; width: 100%; height: 100%;"></iframe>

You need to switch to the inner iframe after switching to the outer one. An explicit wait can wait for the frame and switch to it in one step:

WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe#pageswitcher-content")))

Imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC