Web scrape data from exchange using API

Time:11-27

I am looking to web scrape the second table containing the "Number of Insider Shares Traded" from the following website:

https://www.nasdaq.com/market-activity/stocks/aapl/insider-activity

Preferably, I'd like someone to show how to use the Nasdaq API, if possible. I believe the way I'd normally web scrape (using BeautifulSoup) would be inefficient for this task.

I have some existing code that obtains data from the same website using its API, but for different information. Ideally, I just need a different API endpoint and a few tweaks, following a similar structure to the code below:

import requests
import json

nasdaq_dict = {}

url = 'https://api.nasdaq.com/api/company/AAPL/institutional-holdings?limit=15&type=TOTAL&sortColumn=marketValue&sortOrder=DESC'

headers = {
    'accept': 'application/json, text/plain, */*',
    'origin': 'https://www.nasdaq.com',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.79 Safari/537.36'
}

r = requests.get(url, headers=headers)
data = r.json()['data']  # parse the response body once and reuse it

nasdaq_dict['activePositions'] = data['activePositions']['rows']
nasdaq_dict['newSoldOutPositions'] = data['newSoldOutPositions']['rows']

with open('AAPL_institutional_holdings.json', 'w') as f:
    json.dump(nasdaq_dict, f, indent=4)

CodePudding user response:

Here is one way of getting that data (as a dictionary: please say if you want it as a table):

import requests

headers = {
    'accept-language': 'en-US,en;q=0.9',
    'origin': 'https://www.nasdaq.com',
    'referer': 'https://www.nasdaq.com/',
    'accept': 'application/json, text/plain, */*',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'
}

url = 'https://api.nasdaq.com/api/company/AAPL/insider-trades?limit=15&type=ALL&sortColumn=lastDate&sortOrder=DESC'

r = requests.get(url, headers=headers)
# The second table on the page is returned under the 'numberOfSharesTraded' key
data = r.json()['data']['numberOfSharesTraded']
print(data)

Result in terminal:

{'headers': {'insiderTrade': 'INSIDER TRADE', 'months3': '3 MONTHS', 'months12': '12 MONTHS'}, 'rows': [{'insiderTrade': 'Number of Shares Bought', 'months3': '0', 'months12': '0'}, {'insiderTrade': 'Number of Shares Sold', 'months3': '1,317,881', 'months12': '1,986,819'}, {'insiderTrade': 'Total Shares Traded', 'months3': '1,317,881', 'months12': '1,986,819'}, {'insiderTrade': 'Net Activity', 'months3': '(1,317,881)', 'months12': '(1,986,819)'}]}
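
If you would rather have a table with real numbers than strings, something along these lines might work. It is only a sketch: parse_count and to_table are hypothetical helper names (not part of the Nasdaq API), and it assumes that values in parentheses, such as '(1,317,881)', are negative numbers in accounting notation. The sample dictionary is copied from the result above so the snippet runs without a network call.

```python
def parse_count(text):
    """Convert '1,317,881' to 1317881 and '(1,317,881)' to -1317881.

    Assumption: parentheses mark a negative value (accounting notation).
    """
    cleaned = text.strip().replace(',', '')
    if cleaned.startswith('(') and cleaned.endswith(')'):
        return -int(cleaned[1:-1])
    return int(cleaned)


def to_table(section):
    """Turn the {'headers': ..., 'rows': ...} dictionary into a list of records."""
    keys = list(section['headers'])  # column keys, in the order returned
    label_key, value_keys = keys[0], keys[1:]
    table = []
    for row in section['rows']:
        # First column is a text label; the remaining columns are share counts
        record = {section['headers'][label_key]: row[label_key]}
        for key in value_keys:
            record[section['headers'][key]] = parse_count(row[key])
        table.append(record)
    return table


# Sample copied from the result printed above
sample = {
    'headers': {'insiderTrade': 'INSIDER TRADE', 'months3': '3 MONTHS', 'months12': '12 MONTHS'},
    'rows': [
        {'insiderTrade': 'Number of Shares Bought', 'months3': '0', 'months12': '0'},
        {'insiderTrade': 'Number of Shares Sold', 'months3': '1,317,881', 'months12': '1,986,819'},
        {'insiderTrade': 'Total Shares Traded', 'months3': '1,317,881', 'months12': '1,986,819'},
        {'insiderTrade': 'Net Activity', 'months3': '(1,317,881)', 'months12': '(1,986,819)'},
    ],
}

for record in to_table(sample):
    print(record)
```

Each record is a plain dictionary keyed by the column titles, so the list can be written out with json.dump in the same way as in the question, or passed to a table library if you use one.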