Sharing a very simple VFP crawler that crawls a website for images of all types

Time:09-21

The code is pasted below, for reference only. VFP cannot do multithreading, so you can run multiple processes instead, which is a bit more trouble. All the files together total about 17 GB; on 100 Mbps broadband the crawl takes about half an hour.
Note: if a prompt about a duplicate directory appears, just click Ignore (this probably happens only about 4 times). These are all 404 directories; I did not filter them out because there are so few.
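Ahead of the full listing, the multi-process split mentioned above could be arranged roughly like this. It is only a sketch: the file name crawl_part.prg and the parameters tcFirst and tcLast are made up for illustration and do not appear in the original code. It assumes the program is built as an EXE and started with two command-line arguments, which VFP hands to the main program as character strings.

```foxpro
* crawl_part.prg (hypothetical name, not from the original post)
* Each process handles its own page range, for example:
*   crawl_part.exe 1 10
*   crawl_part.exe 11 20
LPARAMETERS tcFirst, tcLast
LOCAL lnFirst, lnLast, pg

* Default to the full range when no arguments are given
lnFirst = 1
lnLast = 100

* Command-line arguments arrive as character strings, so convert them
IF VARTYPE(tcFirst) = 'C'
    lnFirst = VAL(tcFirst)
ENDIF
IF VARTYPE(tcLast) = 'C'
    lnLast = VAL(tcLast)
ENDIF

FOR pg = lnFirst TO lnLast
    * ... the body of the FOR pg loop from the listing goes here ...
ENDFOR
```

With this layout a single EXE can be started ten times with different ranges, instead of compiling ten separate EXEs.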
IF !DIRECTORY('\PIC')
    MKDIR('\PIC')
ENDIF
FOR pg = 1 TO 100  && The 100 pages can be split into 10 ranges and compiled into 10 EXEs, so that multiple processes speed up the crawl
    lnnum = 30
    IF pg < 2
        lcpgurl = 'https://www.nvshens.com/gallery/yazhou'
    ELSE
        lcpgurl = 'https://www.nvshens.com/gallery/yazhou/' + ALLTRIM(STR(pg)) + '.html'
    ENDIF
    LOCAL oxhttp AS Microsoft.XMLHTTP
    oxhttp = CREATEOBJECT("Microsoft.XMLHTTP")
    oxhttp.OPEN("GET", lcpgurl, .F.)
    oxhttp.SEND()
    SourceCode = STRCONV(oxhttp.responseBody, 11)
    RELEASE oxhttp
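    FOR a = 1 TO lnnum
        * The statement that builds lchtml is cut off in the source, and the lines after it are missing.
        * Judging by the surviving code, they fetched the gallery page through oxhttp1 into SourceCode1,
        * set lnpicnum, and opened the IF block that the final ENDIF closes.
        * lchtml = 'https://www.nvshens.com' + STREXTRACT(SourceCode, "'> ...
            lcfolder = ALLTRIM(STREXTRACT(SourceCode1, '<title>', '</title>', 1))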
            * Remove characters that are not allowed in Windows folder names
            lcfolder = STRTRAN(lcfolder, '|', '')
            lcfolder = STRTRAN(lcfolder, '<', '')
            lcfolder = STRTRAN(lcfolder, '>', '')
            lcfolder = STRTRAN(lcfolder, '/', '')
            lcfolder = STRTRAN(lcfolder, '\', '')
            lcfolder = STRTRAN(lcfolder, ':', '')
            lcfolder = STRTRAN(lcfolder, '"', '')
            lcfolder = STRTRAN(lcfolder, '*', '')
            lcfolder = STRTRAN(lcfolder, '?', '')
            IF !DIRECTORY('\PIC\' + lcfolder)
                MKDIR('\PIC\' + lcfolder)
            ELSE
                lcfolder = lcfolder + '_NEW'
                MKDIR('\PIC\' + lcfolder)
            ENDIF
            lcpicurlbase = STREXTRACT(SourceCode1, "<img src='", "0.jpg", 1)
            RELEASE oxhttp1
            FOR b = 1 TO lnpicnum
                * The first image is 0.jpg; later ones are zero-padded to three digits (001.jpg, 002.jpg, ...)
                IF b < 2
                    lcpicurl = lcpicurlbase + "0.jpg"
                ELSE
                    DO CASE
                    CASE b < 11
                        lcpicurl = lcpicurlbase + '00' + ALLTRIM(STR(b - 1)) + '.jpg'
                    CASE b < 101
                        lcpicurl = lcpicurlbase + '0' + ALLTRIM(STR(b - 1)) + '.jpg'
                    OTHERWISE
                        lcpicurl = lcpicurlbase + ALLTRIM(STR(b - 1)) + '.jpg'
                    ENDCASE
                ENDIF
                LOCAL oxhttp2 AS Microsoft.XMLHTTP
                oxhttp2 = CREATEOBJECT("Microsoft.XMLHTTP")
                oxhttp2.OPEN("GET", lcpicurl, .F.)
                oxhttp2.SEND()
                IF !ISNULL(oxhttp2.responseBody)
                    STRTOFILE(oxhttp2.responseBody, '\PIC\' + lcfolder + '\' + ALLTRIM(STR(b)) + '.jpg')
                ENDIF
                RELEASE oxhttp2
            ENDFOR
        ENDIF
    ENDFOR
ENDFOR
MESSAGEBOX("All done!!!!!", 48, "Tips")
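The folder-name cleanup, the image numbering, and the unfiltered 404 pages in the listing above could each be handled by a small helper. The following is a rough sketch, not the original author's code: the function names CleanFolderName, PicFileName and SaveUrl are invented for illustration, and it assumes the status property of Microsoft.XMLHTTP reports the HTTP status code after a synchronous Send().

```foxpro
* Hypothetical helpers, not part of the original post.

* Strip every character Windows forbids in folder names in one call.
FUNCTION CleanFolderName(tcTitle)
    RETURN ALLTRIM(CHRTRAN(tcTitle, '|<>/\:"*?', ''))
ENDFUNC

* Remote file name for the image at 1-based position tnIndex,
* following the same rule as the DO CASE block in the listing:
* 1 gives 0.jpg, 2 gives 001.jpg, 11 gives 010.jpg, 101 gives 100.jpg.
FUNCTION PicFileName(tnIndex)
    LOCAL lcNum
    IF tnIndex < 2
        RETURN '0.jpg'
    ENDIF
    lcNum = ALLTRIM(STR(tnIndex - 1))
    * PADL truncates longer strings, so never pad to fewer than LEN(lcNum)
    RETURN PADL(lcNum, MAX(3, LEN(lcNum)), '0') + '.jpg'
ENDFUNC

* Download tcUrl to tcFile; returns .T. only when the server answers 200,
* so 404 pages are skipped instead of being written to disk.
FUNCTION SaveUrl(tcUrl, tcFile)
    LOCAL loHttp, llOk
    loHttp = CREATEOBJECT('Microsoft.XMLHTTP')
    loHttp.Open('GET', tcUrl, .F.)
    loHttp.Send()
    llOk = (loHttp.status = 200)
    IF llOk
        STRTOFILE(loHttp.responseBody, tcFile)
    ENDIF
    RETURN llOk
ENDFUNC
```

Inside the inner loop, a call such as SaveUrl(lcpicurlbase + PicFileName(b), '\PIC\' + lcfolder + '\' + ALLTRIM(STR(b)) + '.jpg') would then replace the DO CASE block and the oxhttp2 lines. A network failure in Send() still raises an error here, so wrapping the call in TRY ... CATCH may be worthwhile.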

CodePudding user response:

Saving this to study.

CodePudding user response:

This is worth studying.