Why does my crawler only scrape one piece of data?
The code is as follows:
import scrapy
from bs4 import BeautifulSoup
import re


class StockSpider(scrapy.Spider):
    name = 'stock'
    # allowed_domains = ['quote.eastmoney.com']
    start_urls = ['http://quote.eastmoney.com/stock_list.html']

    def parse(self, response):
        # collect every link that looks like a stock code (sh/sz + 6 digits)
        for href in response.css('a::attr(href)').extract():
            try:
                stock = re.search(r"[s][hz]\d{6}", href).group(0)
                stock = stock.upper()
                url = 'https://xueqiu.com/S/' + stock
                yield scrapy.Request(url, callback=self.parse_stock)
            except:
                continue

    def parse_stock(self, response):
        infoDict = {}
        if response == "":
            exit()
        try:
            name = re.search(r'<div>(.*?)', response.text).group(1)
            infoDict.update({'stock name': name.__str__()})
            tableHtml = re.search(r'"tableHtml":"(.*?)",', response.text).group(1)
            soup = BeautifulSoup(tableHtml, "html.parser")
            table = soup.table
            for i in table.find_all("td"):
                line = i.text
                l = line.split("：")  # this is the Chinese full-width colon (：), not the ASCII one (:)
                infoDict.update({l[0].__str__(): l[1].__str__()})
            yield infoDict
        except:
            print("error")
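One way to narrow this down (a suggestion, not part of the original post): the bare except in parse_stock hides whatever actually goes wrong on the other requests, and xueqiu.com often rejects requests that do not carry browser-like headers, so most pages may simply come back without the "tableHtml" block. A minimal diagnostic sketch, assuming the silent except is the culprit; the spider name, the User-Agent string, and the check for "tableHtml" are illustrative choices, not something from the post:

import re
import scrapy


class StockDebugSpider(scrapy.Spider):
    name = 'stock_debug'
    start_urls = ['http://quote.eastmoney.com/stock_list.html']
    # send a browser-like User-Agent, which xueqiu.com generally expects
    custom_settings = {
        'USER_AGENT': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    }

    def parse(self, response):
        for href in response.css('a::attr(href)').extract():
            m = re.search(r"[s][hz]\d{6}", href)
            if m:
                yield scrapy.Request('https://xueqiu.com/S/' + m.group(0).upper(),
                                     callback=self.parse_stock)

    def parse_stock(self, response):
        # log what actually comes back instead of swallowing it with a bare except
        self.logger.info("status=%s url=%s", response.status, response.url)
        if '"tableHtml"' not in response.text:
            self.logger.warning("no tableHtml block in page: %s", response.url)

Running this and reading the log should show whether the follow-up requests are being blocked (non-200 status or missing table) or whether the parsing itself fails.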
Also, a while ago I tried running someone else's crawler on my computer, and PyCharm simply could not connect to the local database.
Could that have anything to do with the fact that I'm using my dorm's WiFi?
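On the database question: a database running on the same machine is reached through localhost, so the dorm WiFi by itself should not matter; failures there usually mean the service is not running, or the port or credentials are wrong. A quick sketch to check whether anything is even listening locally (3306 is an assumption here, MySQL's default; substitute whatever port your database actually uses):

import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    # try a plain TCP connection; True means something is listening on that port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(port_open("127.0.0.1", 3306))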
CodePudding user response:
The result is as follows