I would like to grab the data in the Key Ratios table from the URL below: https://financials.morningstar.com/ratios/r.html?t=0P000000B7&culture=en&platform=sal
I tried to access it with Selenium using XPath, but in vain, even after switching to the iframe.
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
import time
url = 'https://www.morningstar.com/stocks/xnas/amzn/quote'
browser = webdriver.Firefox()
#Open the URL in browser
browser.get(url)
#####Click Key ratio button#####
#Wait until the Key ratio button is clickable
element = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="keyStats"]')))
#Click keyRatioBtn
element.click()
#####Click Full Key ratio url#####
#Wait until the Full Key ratio url is clickable
element = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, '//a[@]')))
#Click Full Key ratio url
element.click()
#####Get ROE list#####
browser.implicitly_wait(20)
time.sleep(5)
iframe = browser.find_elements(By.TAG_NAME, 'iframe')
browser.switch_to.frame(1)
roeList = browser.find_element_by_xpath('//*[@id="tab-profitability"]')
print(roeList.get_attribute('innerHTML'))
Please help. Many thanks.
CodePudding user response:
You can grab the data in the Key Ratios table using Selenium with pandas as follows:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument("--disable-infobars")
options.add_argument("start-maximized")
options.add_argument("--disable-extensions")
options.add_experimental_option("detach", True)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),options=options)
URL ='https://financials.morningstar.com/ratios/r.html?t=0P000000B7&culture=en&platform=sal'
driver.get(URL)
table = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '//*[@id="tab-profitability"]'))).get_attribute("outerHTML")
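# Recent pandas versions deprecate passing a literal HTML string to read_html, so wrap it in StringIO
from io import StringIO
df = pd.read_html(StringIO(table))[0]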
print(df.dropna(how='all'))
Output:
     Margins % of Sales 2012-12 2013-12 2014-12  ... 2019-12 2020-12 2021-12     TTM
1               Revenue  100.00  100.00  100.00  ...  100.00  100.00  100.00  100.00
3                  COGS   93.23   93.12   93.04  ...   86.16   86.66   85.89   86.59
5          Gross Margin    6.77    6.88    6.96  ...   13.84   13.34   14.11   13.41
7                  SG&A    5.41    5.72    6.61  ...    8.58    7.43    8.81    9.23
9                   R&D       —       —       —  ...       —       —       —       —
11                Other    0.26    0.15    0.15  ...    0.07   -0.02    0.01    0.06
13     Operating Margin    1.11    1.00    0.20  ...    5.18    5.93    5.30    4.12
15  Net Int Inc & Other   -0.22   -0.32   -0.32  ...   -0.20    0.33    2.82    0.61
17           EBT Margin    0.89    0.68   -0.12  ...    4.98    6.26    8.12    4.73
[9 rows x 12 columns]
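If you need the values as numbers rather than strings, something along these lines might help. It is only a sketch under a few assumptions: the small `raw` frame is a made-up stand-in for what `pd.read_html` returns for this page (empty spacer rows, "—" for missing values), and `clean_ratio_table` is a hypothetical helper name, not part of pandas or Selenium.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame returned by pd.read_html:
# every data row is preceded by an all-NaN spacer row, and "—" marks missing values.
raw = pd.DataFrame(
    {
        "Margins % of Sales": [np.nan, "Revenue", np.nan, "R&D", np.nan, "Other"],
        "2020-12": [np.nan, "100.00", np.nan, "—", np.nan, "-0.02"],
        "2021-12": [np.nan, "100.00", np.nan, "—", np.nan, "0.01"],
    }
)


def clean_ratio_table(df: pd.DataFrame) -> pd.DataFrame:
    """Drop spacer rows, index by the label column, and convert values to floats."""
    label_col = df.columns[0]
    cleaned = df.dropna(how="all").set_index(label_col)
    # errors="coerce" turns "—" (and any other non-numeric text) into NaN
    return cleaned.apply(pd.to_numeric, errors="coerce")


clean = clean_ratio_table(raw)
print(clean)
```

With the real table you would pass the `df` from the snippet above instead of `raw`. Coercing with `errors="coerce"` silently hides unexpected text, so check the result if the page layout changes.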
CodePudding user response:
On the website, clicking the link with the text Full Key Ratios Data opens the page in a new tab, so you have to switch to that tab. Wait for it with WebDriverWait and number_of_windows_to_be(2), switch to the new window handle, and then read the table into a pandas DataFrame:
Code Block:
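from io import StringIO
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
# Driver setup was not shown in the original snippet; this assumes a Chrome driver is available
driver = webdriver.Chrome()
driver.get("https://www.morningstar.com/stocks/xnas/amzn/quote")
# Remember the handle of the original tab so the new one can be identified
windows_before = driver.current_window_handle
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@id='keyStats']"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[contains(., 'Full Key Ratios Data')]"))).click()
# Wait for the new tab to open, then switch to it
WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2))
windows_after = driver.window_handles
new_window = [x for x in windows_after if x != windows_before][0]
driver.switch_to.window(new_window)
data = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='r_table1 text2 print97']"))).get_attribute("outerHTML")
# read_html returns a list of DataFrames; StringIO avoids the literal-HTML deprecation in recent pandas
df = pd.read_html(StringIO(data))
print(df)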
Console Output:
[     Margins % of Sales 2012-12 2013-12 2014-12 2015-12 2016-12 2017-12 2018-12 2019-12 2020-12 2021-12     TTM
0                   NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
1               Revenue  100.00  100.00  100.00  100.00  100.00  100.00  100.00  100.00  100.00  100.00  100.00
2                   NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
3                  COGS   93.23   93.12   93.04   91.21   89.69   89.84   86.75   86.16   86.66   85.89   86.59
4                   NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
5          Gross Margin    6.77    6.88    6.96    8.79   10.31   10.16   13.25   13.84   13.34   14.11   13.41
6                   NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
7                  SG&A    5.41    5.72    6.61    6.54    7.11    7.73    7.79    8.58    7.43    8.81    9.23
8                   NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
9                   R&D       —       —       —       —       —       —       —       —       —       —       —
10                  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
11                Other    0.26    0.15    0.15    0.16    0.12    0.12    0.13    0.07   -0.02    0.01    0.06
12                  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
13     Operating Margin    1.11    1.00    0.20    2.09    3.08    2.31    5.33    5.18    5.93    5.30    4.12
14                  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
15  Net Int Inc & Other   -0.22   -0.32   -0.32   -0.62   -0.22   -0.17   -0.50   -0.20    0.33    2.82    0.61
16                  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
17           EBT Margin    0.89    0.68   -0.12    1.47    2.86    2.14    4.84    4.98    6.26    8.12    4.73
18                  NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN]