I am trying to scrape the list of stocks that a Chartink screener shows at any given time.
Example screener: https://chartink.com/screener/15-minute-stock-breakouts
When I use Inspect element in the browser, the stock names appear in the table rows (inside the 'tr' and 'td' tags). But when I print the response in Python, the stock names are missing: there is nothing inside the 'tr' and 'td' tags. This makes me suspect that the Chartink site is scraping-proof, or maybe it's my limited knowledge.
Could you please give it a shot and advise? And if not Python, would I be able to get the stock list with another tool (such as VBA)? I am using Microsoft Edge on Windows 11.
Below is my code. As you can see, I have tried different approaches without success.
import pandas as pd
# from selenium import webdriver
# from selenium.webdriver.common.by import By
import numpy as np
import schedule
from datetime import datetime
import requests
from bs4 import BeautifulSoup
page = requests.get("https://chartink.com/screener/15-minute-stock-breakouts")
soup = BeautifulSoup(page.content, 'lxml')
# url = 'https://chartink.com/screener/15-minute-stock-breakouts'
# driver = webdriver.Edge(executable_path=r'C:\Users\kashk\Downloads\edgedriver_win64\msedgedriver.exe')
# driver.get(url)
# pd.read_html(driver.find_element(by=By.XPATH, value='//*[@id="DataTables_Table_0"]').get_attribute('outerHTML'))
CodePudding user response:
The data is loaded dynamically through an XHR request, so it is not in the HTML that requests.get returns. If you are using Selenium, you probably have to wait for the data to load first.
The method below uses the XMLHTTP approach and seems to work for me:
Option Explicit
Sub Chartink()
    Dim reqObj As Object
    Set reqObj = CreateObject("MSXML2.XMLHTTP")
    With reqObj
        'Load the screener page first; responseText is only available after Send
        .Open "GET", "https://chartink.com/screener/15-minute-stock-breakouts", False
        .Send
        Dim reqDoc As Object
        Set reqDoc = CreateObject("HTMLFile")
        reqDoc.body.innerHTML = .responseText
        'Retrieve the CSRF token that is required for XHR later
        Dim metaEle As Object
        Set metaEle = reqDoc.getElementsByName("csrf-token")(0)
        'Retrieve the JSON data
        .Open "POST", "https://chartink.com/screener/process", False
        .setRequestHeader "x-csrf-token", metaEle.Content
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
        .Send "scan_clause=( {57960} ( [0] 15 minute close > [-1] 15 minute max( 20 , [0] 15 minute close ) and [0] 15 minute volume > [0] 15 minute sma( volume,20 ) ) ) "
        Dim resultDict As Scripting.Dictionary
        Set resultDict = JsonConverter.ParseJson(.responseText)
        'Print name, close and volume for each stock returned by the screener
        Dim i As Long
        For i = 1 To resultDict("data").Count
            Debug.Print resultDict("data")(i)("name") & vbTab & resultDict("data")(i)("close") & vbTab & resultDict("data")(i)("volume")
        Next i
    End With
End Sub
You will need VBA-JSON and a reference to Microsoft Scripting Runtime for JsonConverter.ParseJson.
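Since the question started in Python, here is a rough Python sketch of the same two-step flow: load the screener page, read the csrf-token meta tag, then POST the scan clause to the process URL. The URLs, the scan clause and the JSON keys (data, name, close, volume) are taken from the VBA code above. The function names, the User-Agent header and the assumption that both requests must share the same cookies are mine, and I have not verified this against the live site, so treat it as a starting point rather than a tested solution. It uses only the standard library.

```python
import json
import urllib.parse
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar

SCREENER_URL = "https://chartink.com/screener/15-minute-stock-breakouts"
PROCESS_URL = "https://chartink.com/screener/process"
# Scan clause copied from the VBA answer; it is specific to this screener.
SCAN_CLAUSE = (
    "( {57960} ( [0] 15 minute close > [-1] 15 minute max( 20 , [0] 15 minute close ) "
    "and [0] 15 minute volume > [0] 15 minute sma( volume,20 ) ) ) "
)


class _CsrfMetaParser(HTMLParser):
    """Remembers the content of the first <meta name="csrf-token"> tag."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("name") == "csrf-token" and self.token is None:
            self.token = attr_map.get("content")


def extract_csrf_token(html):
    """Return the CSRF token found in the page HTML (hypothetical helper)."""
    parser = _CsrfMetaParser()
    parser.feed(html)
    if parser.token is None:
        raise ValueError("csrf-token meta tag not found")
    return parser.token


def parse_rows(response_text):
    """Return (name, close, volume) tuples from the JSON response body."""
    payload = json.loads(response_text)
    return [(row["name"], row["close"], row["volume"]) for row in payload["data"]]


def fetch_screener_rows():
    """Fetch the page, then POST the scan clause using the same cookies."""
    # Assumption: the token is tied to the session cookie, so one
    # cookie-aware opener is reused for both requests.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    # Assumption: a browser-like User-Agent; the VBA code does not set one.
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]

    with opener.open(SCREENER_URL, timeout=30) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    token = extract_csrf_token(html)

    body = urllib.parse.urlencode({"scan_clause": SCAN_CLAUSE}).encode("utf-8")
    request = urllib.request.Request(
        PROCESS_URL,
        data=body,
        headers={
            "x-csrf-token": token,
            "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        },
        method="POST",
    )
    with opener.open(request, timeout=30) as resp:
        return parse_rows(resp.read().decode("utf-8"))
```

Calling fetch_screener_rows() should give a list of tuples that you can print or pass to pd.DataFrame(rows, columns=["name", "close", "volume"]). For a different screener you would need to replace SCAN_CLAUSE with the clause that screener sends, which you can find in the browser's Network tab.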