How to fetch stock exchange data with Python

Time:07-14

I am writing a small program to fetch stock exchange data using Python. The sample code below requests a URL and should return the data from that page. Here is the resource that I am using: https://python.plainenglish.io/4-python-libraries-to-help-you-make-money-from-webscraping-57ba6d8ce56d

from xml.dom.minidom import Element
from selenium import webdriver
from bs4 import BeautifulSoup
import logging
from selenium.webdriver.common.by import By
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

url = "http://eoddata.com/stocklist/NASDAQ/A.htm"
driver = webdriver.Chrome(executable_path="C:\Program Files\Chrome\chromedriver")
page = driver.get(url)
# TODO: find element by CSS selector
stock_symbol = driver.find_elements(by=By.CSS_SELECTOR, value='#ctl00_cph1_divSymbols')
soup = BeautifulSoup(driver.page_source, features="html.parser")
elements = []
table    = soup.find('div', {'id','ct100_cph1_divSymbols'})
logging.info(f"{table}")

I've added a todo for getting the element that I am trying to retrieve from the program.

Expected: The proper data should be returned.

Actual: Nothing is returned.

Any help would be greatly appreciated.

CodePudding user response:

The table data isn't dynamic, so you don't need Selenium. You can fetch the page with requests and parse the table using bs4 together with pandas, or using pandas alone.

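from io import StringIO

from bs4 import BeautifulSoup
import requests
import pandas as pd

url = "https://eoddata.com/stocklist/NASDAQ/A.htm"
res = requests.get(url)
# The quotes table is part of the static HTML, so no browser is needed
table = BeautifulSoup(res.text, 'html.parser').select_one('.quotes')
# Wrap the markup in StringIO: recent pandas versions deprecate passing a literal HTML string
df = pd.read_html(StringIO(str(table)))[0]
print(df)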

Output:

      Code                          Name      High  ...  Change.1  Change.2  Unnamed: 9
0     AACG     Ata Creativity Global ADR    1.4300  ...       NaN      2.99         NaN       
1     AACI     Armada Acquisition Corp I    9.8810  ...       NaN      0.11         NaN       
2    AACIU     Armada Acquisition Corp I    9.9600  ...       NaN      0.10         NaN       
3    AACIW  Armada Acquisition Corp I WT    0.1893  ...       NaN      0.32         NaN       
4     AADI          Aadi Biosciences Inc   13.4000  ...       NaN      2.70         NaN       
..     ...                           ...       ...  ...       ...       ...         ...       
565     AZ   A2Z Smart Technologies Corp    3.0200  ...       NaN     15.20         NaN       
566    AZN           Astrazeneca Plc ADR   67.5000  ...       NaN      0.03         NaN       
567   AZPN              Aspen Technology  189.7000  ...       NaN      4.67         NaN       
568   AZTA                    Azenta Inc   76.7800  ...       NaN      1.10         NaN       
569   AZYO      Aziyo Biologics Inc Cl A    6.1000  ...       NaN      3.00         NaN       

[570 rows x 10 columns]

To grab table data using pandas only:

import pandas as pd
url = "https://eoddata.com/stocklist/NASDAQ/A.htm"
df= pd.read_html(url,attrs={"class":"quotes"})[0]
print(df)
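
The sample output above includes columns that appear to contain only NaN (for example Unnamed: 9). A minimal sketch of dropping such columns, assuming they are empty in the full table as well; the small frame below is a hypothetical stand-in built from values in the sample output, not the live data:

```python
import pandas as pd

# Hypothetical stand-in for the scraped frame; values copied from the sample output
df = pd.DataFrame({
    "Code": ["AACG", "AACI"],
    "Name": ["Ata Creativity Global ADR", "Armada Acquisition Corp I"],
    "High": [1.43, 9.881],
    "Change.1": [float("nan"), float("nan")],
    "Unnamed: 9": [float("nan"), float("nan")],
})

# Drop every column in which all values are missing
cleaned = df.dropna(axis="columns", how="all")
print(cleaned.columns.tolist())
```

Using how="all" keeps any column that has at least one real value, so partially filled columns are not lost.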

CodePudding user response:

The most common way to scrape tables is pandas.read_html(), so I would also recommend it.

But to answer your question and follow your approach, select the <div> and <table> more specifically:

soup.select('#ctl00_cph1_divSymbols table')
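
As for why the lookup in the question returned nothing, two details in that code stand out. `{'id','ct100_cph1_divSymbols'}` uses a comma, which makes it a set rather than an `{'id': ...}` dict, and the id is spelled `ct100` (digit one) where the Selenium selector uses `ctl00` (letter l). A minimal sketch of the corrected lookup, run against a small hypothetical HTML snippet rather than the live page:

```python
from bs4 import BeautifulSoup

# Hypothetical snippet that mirrors the id used in the question
html = """
<div id="ctl00_cph1_divSymbols">
  <table class="quotes">
    <tr><th>Code</th><th>Name</th></tr>
    <tr><td>AACG</td><td>Ata Creativity Global ADR</td></tr>
  </table>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A dict maps the attribute name to its value; note "ctl00" with a letter l
found = soup.find("div", {"id": "ctl00_cph1_divSymbols"})
# The misspelled id from the question matches nothing
missing = soup.find("div", {"id": "ct100_cph1_divSymbols"})

print(found is not None, missing is None)
```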

To get and store the data, you could iterate over the rows and append the results to a list:

data = []
for row in soup.select('#ctl00_cph1_divSymbols table tr:has(td)'):
    # Pair the header cells with the cells of the current row
    d = dict(zip(soup.select_one('#ctl00_cph1_divSymbols table tr:has(th)').stripped_strings,row.stripped_strings))
    # Build an absolute URL from the relative link in the row
    d.update({'url': 'https://eoddata.com' + row.a.get('href')})
    data.append(d)

Example

from bs4 import BeautifulSoup
import requests
import pandas as pd

url = "https://eoddata.com/stocklist/NASDAQ/A.htm"
res = requests.get(url)
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(res.text, 'html.parser')

data = []
for row in soup.select('#ctl00_cph1_divSymbols table tr:has(td)'):
    # Pair the header cells with the cells of the current row
    d = dict(zip(soup.select_one('#ctl00_cph1_divSymbols table tr:has(th)').stripped_strings,row.stripped_strings))
    # Build an absolute URL from the relative link in the row
    d.update({'url': 'https://eoddata.com' + row.a.get('href')})
    data.append(d)
pd.DataFrame(data)

Output

   Code   Name                               High     Low   Close      Volume   Change  url
0  AACG   Ata Creativity Global ADR         1.390   1.360   1.380       8,900        0  https://eoddata.com/stockquote/NASDAQ/AACG.htm
1  AACI   Armada Acquisition Corp I         9.895   9.880   9.880       5,400   -0.001  https://eoddata.com/stockquote/NASDAQ/AACI.htm
2  AACIU  Armada Acquisition Corp I         9.960   9.960   9.960         300    -0.01  https://eoddata.com/stockquote/NASDAQ/AACIU.htm
3  AACIW  Armada Acquisition Corp I WT     0.1900  0.1699  0.1700      36,400  -0.0193  https://eoddata.com/stockquote/NASDAQ/AACIW.htm
4  AADI   Aadi Biosciences Inc              13.40   12.66   12.90      98,500    -0.05  https://eoddata.com/stockquote/NASDAQ/AADI.htm
5  AADR   Advisorshares Dorsey Wright ETF   47.49   46.82   47.49       1,100      0.3  https://eoddata.com/stockquote/NASDAQ/AADR.htm
6  AAL    American Airlines Gp              14.44   13.70   14.31  45,193,100    -0.46  https://eoddata.com/stockquote/NASDAQ/AAL.htm

...
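
Because the values are collected via stripped_strings, every field in data is a string, including numbers such as 45,193,100. A minimal sketch of converting them, assuming the column names shown in the output above; to_number is a hypothetical helper, not part of bs4 or pandas:

```python
def to_number(text):
    """Convert a scraped numeric string such as '45,193,100' or '-0.46'."""
    cleaned = text.replace(",", "")
    try:
        return int(cleaned)
    except ValueError:
        return float(cleaned)

# One row copied from the sample output above
row = {"Code": "AAL", "High": "14.44", "Low": "13.70", "Close": "14.31",
       "Volume": "45,193,100", "Change": "-0.46"}

# Only these columns hold numbers; everything else stays a string
numeric_fields = ("High", "Low", "Close", "Volume", "Change")
converted = {key: to_number(value) if key in numeric_fields else value
             for key, value in row.items()}
print(converted)
```

Trying int first keeps volumes as integers, while prices and changes fall back to float.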
