[Crawling Douban photos with a Selenium crawler]

Time:12-03

The code is as follows:
import requests
import os
from lxml import etree
from selenium import webdriver
import time

path = "D:/img"

def download(src, id):
    # Build the target file name from the folder, the id and the extension
    picpath = path + '/' + str(id) + 'jpg'
    try:
        pic = requests.get(src, timeout=20)
    except requests.exceptions.ConnectionError:
        print("Image download failed")
        return  # nothing to write if the request failed
    fp = open(picpath, "wb")
    fp.write(pic.content)
    fp.close()


driver_path = r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe"
# executable_path is the Selenium 3 API; Selenium 4 passes a Service object instead
browser = webdriver.Chrome(executable_path=driver_path)

url = "https://movie.douban.com/subject_search?search_text=" + 'joey wong' + '&cat=1002' + '&start=' + str(1)
browser.get(url)
html = etree.HTML(browser.page_source)  # parse the page source
src = "https://img9.doubanio.com/view/photo/s_ratio_poster/public/p498333714.webp"
src = src.replace("s_ratio_poster", "l")
src = src.replace('s_ratio_celebrity', 'l')
src = src.replace("webp", "jpg")

title = 'beautiful Shanghai? (2004):'

download(src, title)  # call the download function
print("downloaded")
browser.close()  # close the browser after downloading
-------------------- divider --------------------

The downloaded pictures cannot be opened; the viewer reports an error for the image. Why is that?
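One possible explanation, offered as a guess rather than a confirmed diagnosis: the title used as the file name contains ':' and '?', and Windows does not allow either character in a file name (on NTFS a ':' is treated as a stream separator), so the bytes may never land in an ordinary image file. It is also possible that the server returns an error page instead of image data. The sketch below checks both points. The helpers safe_filename and looks_like_image are hypothetical names, the User-Agent and Referer headers are assumptions about what the image host expects, and urllib is swapped in for requests only to keep the example standard-library only.

```python
import os
import re
import urllib.request

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_filename(title, ext=".jpg"):
    """Hypothetical helper: turn a title into a name Windows will accept."""
    # Drop non-printable characters, including invisible formatting marks.
    cleaned = "".join(ch for ch in title if ch.isprintable())
    # Remove reserved characters, then trim spaces and dots at the ends.
    cleaned = _INVALID.sub("", cleaned).strip(" .")
    return (cleaned or "image") + ext


def looks_like_image(data):
    """Hypothetical helper: check the magic bytes of JPEG, PNG, GIF, WebP."""
    return (
        data.startswith(b"\xff\xd8\xff")
        or data.startswith(b"\x89PNG\r\n\x1a\n")
        or data[:6] in (b"GIF87a", b"GIF89a")
        or (data[:4] == b"RIFF" and data[8:12] == b"WEBP")
    )


def download(src, title, folder="D:/img"):
    """Fetch src and save it under a sanitized name; return the saved path."""
    os.makedirs(folder, exist_ok=True)
    # Assumption: the host may refuse requests without browser-like headers.
    request = urllib.request.Request(src, headers={
        "User-Agent": "Mozilla/5.0",
        "Referer": "https://movie.douban.com/",
    })
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=20) as response:
        data = response.read()
    if not looks_like_image(data):
        raise ValueError("response body is not image data")
    picpath = os.path.join(folder, safe_filename(title))
    with open(picpath, "wb") as fp:
        fp.write(data)
    return picpath
```

With this helper, safe_filename('beautiful Shanghai? (2004):') gives 'beautiful Shanghai (2004).jpg'. If download raises ValueError, the saved bytes were never an image, which points at the request rather than at the file name.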