import requests
from bs4 import BeautifulSoup
url = "http://product.dangdang.com/23950158.html"
# Send a browser-like User-Agent so the site treats the request as a normal visit
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0'}
r = requests.get(url=url, headers=headers, timeout=50)
# Print the raw HTML returned by the server
print(r.text)
The code is very simple. Sometimes the output looks like this:
[screenshot of the first result]
That is, of course, what I want, but sometimes it looks like this:
[screenshot of the second result]
Could an expert tell me what to do? How can I make the response consistently look like the first picture?
Should I move on to Scrapy next?
CodePudding user response:
If this is a dynamically rendered page and the content has not been rendered, I suggest using scrapy-splash.
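scrapy-splash works by sending pages through a Splash rendering service. Before wiring it into a Scrapy project, one quick way to check whether rendering actually fixes the inconsistent output might be to call Splash's HTTP API directly. The sketch below is only an illustration under assumptions: it assumes a Splash instance is already running at http://localhost:8050, the helper names build_render_url and fetch_rendered_html are made up for this example, and the 2 second wait is a guess that may need tuning. It uses only the standard library; requests.get on the same URL should work as well.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Splash instance is listening here (8050 is Splash's default port).
SPLASH_BASE = "http://localhost:8050"


def build_render_url(page_url, wait=2.0, splash_base=SPLASH_BASE):
    """Build a Splash render.html URL for page_url (hypothetical helper).

    `wait` is how many seconds Splash waits after page load, which gives
    the page's JavaScript time to run before the HTML is returned.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=2.0, timeout=50):
    """Ask Splash to load and render the page, then return the HTML as text."""
    with urlopen(build_render_url(page_url, wait), timeout=timeout) as resp:
        # Undecodable bytes are replaced rather than raising an error.
        return resp.read().decode("utf-8", errors="replace")
```

With Splash running, fetch_rendered_html("http://product.dangdang.com/23950158.html") can be printed and compared with r.text from the question. If the rendered HTML is consistently complete, that would suggest the scrapy-splash route is worth setting up.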