My code opens a webpage and is supposed to pull each row of information from the table on it, but it comes back empty.
Expected output: the title of each row is printed.
Actual output: nothing is printed.
import time
import requests
from selenium import webdriver
driver = webdriver.Chrome()
bracket=[]
url='https://www.sabcs.org/Program/Poster-Sessions/Poster-Session-1'
driver.get(url)
time.sleep(3)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
r=requests.get(url)
page_source=r.content
each_field=driver.find_elements_by_xpath(".//tr[@class='normaltext']")
for item in each_field:
    print(item.text)
CodePudding user response:
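The table is inside an <iframe>, so the rows are not part of the top-level document and your XPath finds nothing. You need to switch to that frame first. Also, I'd just use pandas here to parse the table.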
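from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = 'https://www.sabcs.org/Program/Poster-Sessions/Poster-Session-1'
driver.get(url)
# The table lives in the last <iframe> on the page, so switch into it first.
# find_elements_by_xpath was removed in Selenium 4.3; use find_elements(By.XPATH, ...).
driver.switch_to.frame(driver.find_elements(By.XPATH, ".//iframe")[-1])
# Recent pandas versions expect a file-like object rather than a raw HTML string.
df = pd.read_html(StringIO(driver.page_source))[0]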
Output:
print(df)
                                                     0                                                  1
0                                                  NaN                                                NaN
1    Poster Session 1 – Wednesday, December 8, 2021...  Poster Session 1 – Wednesday, December 8, 2021...
2                                                  NaN                                                NaN
3                                                  NaN                Axillary Staging and Sentinel Nodes
4                                             P1-01-01  Prospective ultrasonographic surveillance stud...
..                                                 ...                                                ...
279                                           P1-24-04  Spatially resolved cell type heterogeneity unc...
280                                           P1-24-05  Breast conserving surgery for non-metastatic i...
281                                           P1-24-06  Risk factor modeled microenvironment effects l...
282                                           P1-24-07  Management trends and outcomes assessment for ...
283                                                NaN                                                NaN
[284 rows x 2 columns]
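To get from that DataFrame to the row titles the question asks for, you still have to drop the filler rows. Here is a minimal sketch of one way to do it. It assumes column 0 holds the poster ID and column 1 the title, as the output above suggests. The names extract_titles, poster_id and title are made up for illustration, and sample is a small hand-built stand-in for the scraped table, not real scraped data.

```python
import pandas as pd


def extract_titles(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the rows that carry both a poster ID and a title."""
    # Illustrative column names; read_html returned the integer labels 0 and 1.
    out = df.rename(columns={0: "poster_id", 1: "title"})
    # Spacer rows and section headings have no poster ID (NaN), so drop them.
    out = out.dropna(subset=["poster_id"])
    # The banner row repeats the same text in both columns, so drop it too.
    out = out[out["poster_id"] != out["title"]]
    return out.reset_index(drop=True)


# Hand-built stand-in shaped like the output above (titles shortened).
sample = pd.DataFrame(
    [
        [None, None],
        ["Poster Session 1", "Poster Session 1"],
        [None, "Axillary Staging and Sentinel Nodes"],
        ["P1-01-01", "Prospective ultrasonographic surveillance"],
        ["P1-24-07", "Management trends and outcomes assessment"],
    ]
)

posters = extract_titles(sample)
for title in posters["title"]:
    print(title)
```

With the real table you would call extract_titles(df) instead of passing sample. If the page layout differs from what the output above suggests, the two filters would need adjusting.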