Beginner teaching myself web scraping: why does my crawler return no results?
Time:11-21
My crawler runs, but it returns no results. Why is that? What is wrong with it? Could someone more experienced please point me in the right direction? Thank you!
import requests
from bs4 import BeautifulSoup

headers = {
    'origin': 'https://m.maizuo.com',
    'referer': 'https://m.maizuo.com/v5/?co=mzmovie',
    'user-agent': 'Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0',
}
url = 'https://m.maizuo.com/v5/?co=mzmovie#.../nowPlaying'  # URL to fetch
res = requests.get(url, headers=headers)  # fetch the data
print(res.status_code)  # status of the request
bs = BeautifulSoup(res.text, 'html.parser')  # parse the data
soups = bs.find_all(class_="nowPlayingFilm-item")  # find every element with this class
for soup in soups:
    movie_name = soup.find(class_="name")
    print(movie_name)
CodePudding user response:
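This is ordinary asynchronous loading: the page first loads an empty framework and then fills in the data with a separate request. Your code only receives the initial HTML, which does not contain the film list yet, so find_all matches nothing. Pay attention to the request the page makes to load its data, and call that request instead. The headers you send are also far from enough; copy the request headers directly from the site.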
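A quick way to see the difference is to count how often the class you are looking for appears in the HTML you actually received. The sketch below uses only the standard library; `initial_html` and `rendered_html` are made-up stand-ins for illustration, not the real maizuo markup, and `count_class` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in html carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Assumed example: what a plain HTTP request might receive (empty framework only).
initial_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

# Assumed example: what the browser shows after the script has filled in the data.
rendered_html = (
    '<div id="app">'
    '<div class="nowPlayingFilm-item"><span class="name">Film A</span></div>'
    '<div class="nowPlayingFilm-item"><span class="name">Film B</span></div>'
    '</div>'
)

print(count_class(initial_html, "nowPlayingFilm-item"))   # 0
print(count_class(rendered_html, "nowPlayingFilm-item"))  # 2
```

If the count for res.text is 0 while the browser clearly shows the items, the data is arriving through a separate request.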
# Request the data endpoint directly, with headers copied from the browser
import requests

headers = {
    'Accept': 'application/json, text/plain, */*',
    # note: 'br' responses can only be decoded if a brotli package is installed
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'Host': 'm.maizuo.com',
    'Pragma': 'no-cache',
    'Referer': 'https://m.maizuo.com/v5/?co=mzmovie',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-origin',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36',
    'X-Client-Info': '{"a":"3000","ch":"1002","v":"5.0.4","e":"1597407020374774551281665","bc":"110100"}',
    'X-Host': 'mall.film-ticket.film.list',
    'X-Requested-With': 'XMLHttpRequest',
    'X-Token': 'undefined',
}
url = 'https://m.maizuo.com/gateway?cityId=110100&pageNum=2&pageSize=10&type=1&k=1825547'
req = requests.get(url, headers=headers)  # the dict above must be passed under the same name
print(req.content.decode('utf8'))
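The request above prints the raw response text. If that text is JSON, the film names can be pulled out with the standard json module. This is only a sketch: the layout used here (a "data" object holding a "films" list whose entries have a "name" field) is an assumption, so check it against what the request actually prints. `extract_film_names` and the `sample` string are hypothetical, made up for illustration.

```python
import json


def extract_film_names(payload_text):
    """Return the film names from a JSON response body.

    Assumed layout (not confirmed by the thread):
    {"data": {"films": [{"name": "..."}, ...]}}
    """
    payload = json.loads(payload_text)
    data = payload.get("data") if isinstance(payload, dict) else None
    films = data.get("films") if isinstance(data, dict) else None
    if not isinstance(films, list):
        return []  # layout differs from the assumption, nothing to extract
    return [film["name"] for film in films
            if isinstance(film, dict) and "name" in film]


# Made-up sample body in the assumed layout
sample = '{"status": 0, "data": {"films": [{"name": "Film A"}, {"name": "Film B"}]}, "msg": "ok"}'
print(extract_film_names(sample))  # ['Film A', 'Film B']
```

With the real response you would call extract_film_names(req.content.decode('utf8')); an empty list means the layout does not match the assumption.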