I work in the aviation industry and need to monitor ticket information for multiple flight segments and dates, with a fairly heavy demand for real-time data, so I wrote a crawler for Ctrip ticket information. I have tried both aiohttp with asyncio and plain requests. No matter how much I slow the crawler down, after it has fetched a small portion of the data Ctrip always reports that access is too fast and then pops up a verification challenge, so I assume I am hitting its anti-crawling protection. To get around it I have tried random User-Agent strings, random proxies (copied from free lists online, low quality, and with no mechanism to verify that they work), time.sleep, and so on. Could an expert tell me how to modify the code? Or provide more efficient proxy pool code? Or explain how to handle the site's verification mechanism automatically?
CodePudding user response:
For both problems -> pay up and get stronger!
CodePudding user response:
"Or provide more efficient proxy pool code? Or explain how to handle the site's verification mechanism automatically?"
- How efficient a proxy pool is depends on its proxy source. If the source is low quality, no pool code will make it efficient, so I suggest spending a little money on high-quality proxies.
- Until you understand the site's verification mechanism clearly, besides the usual User-Agent rotation I advise tuning the sleep interval: start with a large delay and reduce it step by step to find the site's detection threshold. You can also try crawling the mobile site, where detection is relatively weak.
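The "start large and work down" advice can be sketched roughly as below. This is only an illustration under assumptions: `AdaptiveDelay`, `crawl`, `fetch` and `is_blocked` are hypothetical names, not part of the original crawler, and the starting delay and multipliers are arbitrary values to adjust. `fetch` stands for whatever coroutine downloads a page, and `is_blocked` for whatever check recognises the verification page.

```python
import asyncio
import random


class AdaptiveDelay:
    """Per-request delay that shrinks slowly on success and doubles when blocked."""

    def __init__(self, start=10.0, floor=1.0, ceiling=120.0):
        # All three values are assumptions; tune them for the target site.
        self.delay = start
        self.floor = floor
        self.ceiling = ceiling

    def on_success(self):
        # Creep downward in small steps to probe for the detection threshold.
        self.delay = max(self.floor, self.delay * 0.9)

    def on_blocked(self):
        # Back off sharply as soon as the site asks for verification.
        self.delay = min(self.ceiling, self.delay * 2)

    async def wait(self):
        # Add jitter so the requests are not evenly spaced.
        await asyncio.sleep(self.delay * random.uniform(0.8, 1.2))


async def crawl(urls, fetch, is_blocked, pacer):
    """Fetch urls one at a time, adjusting the delay after every response."""
    results = {}
    for url in urls:
        await pacer.wait()
        body = await fetch(url)
        if is_blocked(body):
            pacer.on_blocked()
            continue  # skipped here; a real crawler would queue it for retry
        pacer.on_success()
        results[url] = body
    return results
```

The delay at which blocks stop appearing gives a rough estimate of the threshold. Logging `pacer.delay` over a run makes that value easy to read off.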
CodePudding user response:
Add a time.sleep between requests.
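One caveat if the crawler stays on asyncio, as in the question: time.sleep blocks the whole event loop, so inside a coroutine the pause should be `await asyncio.sleep(...)`. A minimal sketch, assuming a hypothetical `fetch` coroutine that downloads one URL; `polite_fetch`, `fetch_all`, the concurrency limit and the delay range are illustrative choices, not values known to suit Ctrip.

```python
import asyncio
import random


async def polite_fetch(url, fetch, limiter, min_delay=1.0, max_delay=3.0):
    # The semaphore caps how many requests are in flight at once.
    async with limiter:
        # asyncio.sleep yields to the event loop; time.sleep would stall it.
        await asyncio.sleep(random.uniform(min_delay, max_delay))
        return await fetch(url)


async def fetch_all(urls, fetch, max_concurrent=2, min_delay=1.0, max_delay=3.0):
    # Create the semaphore inside the running loop.
    limiter = asyncio.Semaphore(max_concurrent)
    tasks = [polite_fetch(u, fetch, limiter, min_delay, max_delay) for u in urls]
    # gather returns the results in the same order as urls.
    return await asyncio.gather(*tasks)
```

Lowering `max_concurrent` to 1 makes the crawl strictly sequential, which is the closest equivalent to a requests loop with time.sleep.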