Scraping stock names from Chartink screener

Time:03-21

I am trying to scrape the list of stocks that a Chartink screener shows at any given time.

Example screener: https://chartink.com/screener/15-minute-stock-breakouts

The Inspect element option shows the stock names inside the HTML table tags ('tr' and 'td'). But when I print the output in Python, the stock names are missing (there is nothing inside the 'tr' and 'td' tags). This makes me suspect that the Chartink site is scraping-proof. Or maybe it's my limited knowledge.

Could you please give it a shot and advise? And if not with Python, could I get the stock list with another tool (such as VBA)? I am using Microsoft Edge on Windows 11.

Below is my code. As you can see, I have tried different approaches, but none of them worked.

import pandas as pd
# from selenium import webdriver
# from selenium.webdriver.common.by import By
import numpy as np
import schedule
from datetime import datetime
import requests
from bs4 import BeautifulSoup

page = requests.get("https://chartink.com/screener/15-minute-stock-breakouts")
soup = BeautifulSoup(page.content, 'lxml')
print(soup)

# url = 'https://chartink.com/screener/15-minute-stock-breakouts'
# driver = webdriver.Edge(executable_path=r'C:\Users\kashk\Downloads\edgedriver_win64\msedgedriver.exe')
# driver.get(url)
# pd.read_html(driver.find_element(by=By.XPATH, value='//*[@id="DataTables_Table_0"]').get_attribute('outerHTML'))

CodePudding user response:

The data is loaded dynamically through an XHR, so it is not part of the initial HTML that requests.get returns. If you are using Selenium, you will probably have to wait for the data to load before reading the table.

The method below uses the XMLHTTP approach instead and seems to work for me:

Option Explicit

Sub Chartink()
    Dim reqObj As Object
    Set reqObj = CreateObject("MSXML2.XMLHTTP")
    
    With reqObj
        .Open "GET", "https://chartink.com/screener/15-minute-stock-breakouts", False
        .Send
        
        Dim reqDoc As Object
        Set reqDoc = CreateObject("HTMLFile")
        reqDoc.body.innerHTML = .responseText
        
        'Retrieve the CSRF token that is required for XHR later
        Dim metaEle As Object
        Set metaEle = reqDoc.getElementsByName("csrf-token")(0)
    
        'Retrieve the JSON data
        .Open "POST", "https://chartink.com/screener/process", False
        .setRequestHeader "x-csrf-token", metaEle.Content
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
        .Send "scan_clause=( {57960} ( [0] 15 minute close > [-1] 15 minute max( 20 , [0] 15 minute close ) and [0] 15 minute volume > [0] 15 minute sma( volume,20 ) ) ) "
                
        Dim resultDict As Scripting.Dictionary
        Set resultDict = JsonConverter.ParseJson(.responseText)
        
        Dim i As Long
        For i = 1 To resultDict("data").Count
            Debug.Print resultDict("data")(i)("name") & vbTab & resultDict("data")(i)("close") & vbTab & resultDict("data")(i)("volume")
        Next i
    End With
End Sub

You will need VBA-JSON (for the JsonConverter.ParseJson method) and a reference to Microsoft Scripting Runtime (for Scripting.Dictionary).
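
Since the question was asked in Python, here is a rough sketch of the same two-step flow as the VBA answer above (GET the page, read the csrf-token meta tag, POST the scan clause to /screener/process). It uses only the standard library. The function names, the cookie jar and the User-Agent header are my own assumptions rather than anything Chartink documents, and I have only mirrored the URLs, the header and the JSON field names ("data", "name", "close", "volume") from the VBA code, so treat it as a starting point.

```python
import json
import urllib.parse
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar

SCREENER_URL = "https://chartink.com/screener/15-minute-stock-breakouts"
PROCESS_URL = "https://chartink.com/screener/process"
# Scan clause copied verbatim from the VBA answer.
SCAN_CLAUSE = (
    "( {57960} ( [0] 15 minute close > [-1] 15 minute max( 20 , [0] 15 minute close ) "
    "and [0] 15 minute volume > [0] 15 minute sma( volume,20 ) ) ) "
)


class CsrfTokenParser(HTMLParser):
    """Remembers the content attribute of <meta name="csrf-token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "csrf-token":
            self.token = attributes.get("content")


def extract_csrf_token(html):
    """Return the CSRF token embedded in the screener page HTML."""
    parser = CsrfTokenParser()
    parser.feed(html)
    if parser.token is None:
        raise ValueError("csrf-token meta tag not found")
    return parser.token


def rows_from_response(payload):
    """Turn the parsed JSON response into (name, close, volume) tuples."""
    return [(row["name"], row["close"], row["volume"]) for row in payload["data"]]


def fetch_screener_rows():
    """GET the page for the token, then POST the scan clause (hypothetical helper)."""
    # Assumption: the POST needs the cookies set by the GET, so both
    # requests go through one opener that shares a cookie jar.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    # Assumption: a browser-like User-Agent; the VBA code does not set one explicitly.
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]

    with opener.open(SCREENER_URL) as page:
        token = extract_csrf_token(page.read().decode("utf-8", errors="replace"))

    # urlencode escapes the spaces, brackets and ">" in the scan clause.
    body = urllib.parse.urlencode({"scan_clause": SCAN_CLAUSE}).encode("utf-8")
    request = urllib.request.Request(
        PROCESS_URL, data=body, headers={"x-csrf-token": token}
    )
    with opener.open(request) as response:
        return rows_from_response(json.load(response))
```

Calling fetch_screener_rows() should give one tuple per stock, which you can loop over and print, or pass to pd.DataFrame(rows, columns=["name", "close", "volume"]) if you want a table. The same flow should also work with requests.Session and BeautifulSoup, which the question already imports.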
