I have already tried many of the variations suggested here, but I have yet to fix the problem. I started with:
import pandas as pd
import requests
page = requests.get('http://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=industries&sector=10')
df_list = pd.read_html(page.text)
and I can see the correct headers, so I am looking at the right location. I then tried changing the flavor to bs4 and html5lib, with no change. The data values are always NaN and there is only one row, index 0, when there should be 3 or 4. My original attempt is identical to code I use for a different table on the same website, and that works perfectly. (Also, this is my first post, so please let me know how I can improve.)
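One way to narrow this down is to check whether the HTML that requests receives actually contains any filled-in data rows for the table. Below is a minimal sketch using only the standard library. The helper name count_data_rows, the class TableRowCounter, and both sample HTML strings are made up for illustration (they are not the real page content); the table id tableSort is the one used in the answer below. If the count is 0 for page.text, the rows are probably added later by a script, which pd.read_html cannot see.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Count <tr> rows with at least one non-empty <td> inside one table."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0             # > 0 while inside the target table
        self.in_cell = False       # True while inside a <td>
        self.row_has_text = False  # True once the current row has cell text
        self.data_rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # Enter the target table, or track a table nested inside it.
            if self.depth or dict(attrs).get("id") == self.table_id:
                self.depth += 1
        elif self.depth and tag == "tr":
            self.row_has_text = False
        elif self.depth and tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "table":
            self.depth -= 1
        elif tag == "td":
            self.in_cell = False
        elif tag == "tr" and self.row_has_text:
            self.data_rows += 1
            self.row_has_text = False

    def handle_data(self, data):
        if self.depth and self.in_cell and data.strip():
            self.row_has_text = True


def count_data_rows(html, table_id):
    """Return how many rows of the given table contain cell text."""
    parser = TableRowCounter(table_id)
    parser.feed(html)
    return parser.data_rows


# Hypothetical stand-ins: headers present, but the body cells are empty.
STATIC_HTML = """
<table id="tableSort">
  <thead><tr><th>Industry</th><th>Change</th></tr></thead>
  <tbody><tr><td></td><td></td></tr></tbody>
</table>
"""

# Hypothetical stand-in for the same table after a browser has rendered it.
RENDERED_HTML = """
<table id="tableSort">
  <thead><tr><th>Industry</th><th>Change</th></tr></thead>
  <tbody>
    <tr><td>Example Industry A</td><td>+0.10%</td></tr>
    <tr><td>Example Industry B</td><td>-0.20%</td></tr>
  </tbody>
</table>
"""

static_rows = count_data_rows(STATIC_HTML, "tableSort")
rendered_rows = count_data_rows(RENDERED_HTML, "tableSort")
print(static_rows, rendered_rows)
```

In the real case you would call count_data_rows(page.text, 'tableSort') on the response from requests.get and compare it with the number of rows you see in the browser.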
CodePudding user response:
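Unfortunately, I had to use Selenium to retrieve the DataFrame, presumably because the table rows are filled in by JavaScript after the page loads, which requests does not execute. If that is not a problem, feel free to try the following: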
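from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import pandas as pd
# Selenium 4 style: the driver path is passed via Service, and find_element_by_id no longer exists
driver = webdriver.Chrome(service=Service('<PATH_TO_WEBDRIVER>'))
driver.get('https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=industries&sector=10')
# Parse only the rendered table (id 'tableSort') rather than the whole page
df = pd.read_html(driver.find_element(By.ID, 'tableSort').get_attribute('outerHTML'))[0]
driver.quit()  # close the browser once the table has been read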