Google Finance Stock Screener - Python (Scrapy)

Time:11-16

I am trying to scrape stock prices from Google Finance using Scrapy. The code does not show any errors, but the output file comes out blank.

Here is the code:

import scrapy

bse_list=['quote/ABB:NSE','quote/AEGISLOG:NSE','quote/AMARAJABAT:NSE','quote/AMBALALSA:NSE','quote/HDFC:NSE','quote/ANDHRAPET:NSE','quote/ANSALAPI:NSE']

class CrawlSpider(scrapy.Spider):
    name = 'crawl'
    allowed_domains = ['www.google.com/finance/']
    start_urls = ['https://google.com/finance/']

    def parse(self, response):
        for stock in bse_list:
            url_new = response.urljoin(stock)
            yield scrapy.Request(url_new, callback = self.parse_book)

    def parse_book(self, response):
        stock_name = response.xpath('//*[@]/text()').extract_first()
        current_price = response.xpath('//*[@]/text()').extract_first()
        stock_info = response.xpath('//*[@]/text()').extract()

        last_closing_price = stock_info[0]
        day_range = stock_info[1]
        year_range = stock_info[2]
        market_cap = stock_info[3]
        p_e_ratio = stock_inf[4]

        yield {
            "stock_name": stock_name,
            "current_price": current_price,
            "last_closing_price": last_closing_price,
            "day_range": day_range,
            "year_range": year_range,
            "market_cap": market_cap,
            "p_e_ratio": p_e_ratio
        }

CodePudding user response:

The problem is in the stock_info selection; the rest of your code works fine.
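To see why a bad stock_info selection can leave the output file empty, here is a minimal sketch without Scrapy. It assumes the selector returned fewer values than the code indexes; parse_book_like and collect are hypothetical stand-ins, and the sample value is a placeholder, not real data. A generator callback that raises before its yield produces no item at all.

```python
def parse_book_like(stock_info):
    """Stand-in for the question's parse_book, taking the extracted list directly."""
    last_closing_price = stock_info[0]  # raises IndexError when the list is empty
    yield {"last_closing_price": last_closing_price}


def collect(stock_info):
    """Run the callback and report what it produced or which error stopped it."""
    try:
        return list(parse_book_like(stock_info))
    except IndexError as exc:
        return f"no item: {exc}"


print(collect(["placeholder"]))  # [{'last_closing_price': 'placeholder'}]
print(collect([]))               # no item: list index out of range
```

The spider below comments those lines out, so the callback always reaches its yield.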

import scrapy

bse_list = ['quote/ABB:NSE', 'quote/AEGISLOG:NSE', 'quote/AMARAJABAT:NSE',
            'quote/AMBALALSA:NSE', 'quote/HDFC:NSE', 'quote/ANDHRAPET:NSE', 'quote/ANSALAPI:NSE']


class CrlSpider(scrapy.Spider):
    name = 'crl'
   
    start_urls = ['https://google.com/finance/']


    def parse(self, response):
        for stock in bse_list:
            url_new = response.urljoin(stock)
            yield scrapy.Request(url_new, callback=self.parse_book)


    def parse_book(self, response):
        stock_name = response.xpath('//*[@]/text()').extract_first()
        current_price = response.xpath('//*[@]/text()').extract_first()
        # stock_info = response.xpath('//*[@]/text()').extract()

        # last_closing_price = stock_info[0]
        # day_range = stock_info[1]
        # year_range = stock_info[2]
        # market_cap = stock_info[3]
        # p_e_ratio = stock_info[4]

        yield {
            "stock_name": stock_name,
            "current_price": current_price,
            #"last_closing_price": last_closing_price,
            # "day_range": day_range,
            # "year_range": year_range,
            # "market_cap": market_cap,
            # "p_e_ratio": p_e_ratio
        }
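If you want the commented-out fields back, one option is to stop indexing the list directly and pad it instead, so a short result gives None rather than an exception. This is only a sketch: the field order is copied from the question and is an assumption about the page, and fields_from_info is a hypothetical helper name.

```python
# Field order is taken from the question's code; it is an assumption about the page.
INFO_FIELDS = ["last_closing_price", "day_range", "year_range", "market_cap", "p_e_ratio"]


def fields_from_info(stock_info):
    """Map extracted values to field names, using None for any missing value."""
    # Pad with None when fewer values than fields came back; zip drops any extras.
    padded = list(stock_info) + [None] * (len(INFO_FIELDS) - len(stock_info))
    return dict(zip(INFO_FIELDS, padded))


# Placeholder strings, not real quotes: only two of the five values are present.
print(fields_from_info(["a", "b"]))
```

In parse_book you could then merge the result into the yielded dict, for example `{"stock_name": stock_name, "current_price": current_price, **fields_from_info(stock_info)}`.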

Output from running the spider:

{'stock_name': 'Ansal Properties and Infrastructure Ltd', 'current_price': '₹13.30'}       
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/ANDHRAPET:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AMBALALSA:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AEGISLOG:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/ABB:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/HDFC:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.google.com/finance/quote/AMARAJABAT:NSE> (referer: https://www.google.com/finance/)
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/ANDHRAPET:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AMBALALSA:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AEGISLOG:NSE>
{'stock_name': None, 'current_price': None}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/ABB:NSE>
{'stock_name': 'ABB India Ltd', 'current_price': '₹2,139.00'}
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/HDFC:NSE>
{'stock_name': 'Housing Development Finance Corp Ltd', 'current_price': '₹2,994.15'}       
2021-11-15 20:18:09 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.google.com/finance/quote/AMARAJABAT:NSE>
{'stock_name': 'Amara Raja Batteries Ltd', 'current_price': '₹685.40'}
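Note that the spider above has no allowed_domains, and the log shows every request going to www.google.com. If you add allowed_domains back, it should hold bare domain names such as 'google.com', not URL paths like the question's 'www.google.com/finance/'. The sketch below is a simplified stand-in for that kind of host check, not Scrapy's actual implementation; host_allowed is a hypothetical helper.

```python
from urllib.parse import urlparse


def host_allowed(url, allowed_domains):
    """Return True if the URL's host equals an allowed domain or is a subdomain of it."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


url = "https://www.google.com/finance/quote/ABB:NSE"
print(host_allowed(url, ["google.com"]))               # True
print(host_allowed(url, ["www.google.com/finance/"]))  # False: a path is not a domain
```

With an entry that never matches the host, requests made from parse would likely be dropped as offsite, which is another way to end up with an empty output file.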