I am trying to check the status of about 1000 subdomains all at once. I have tried multiple techniques, including grequests, which wasn't much faster than plain requests, and then I switched to asyncio with aiohttp, which is now slower than the normal requests library. On closer inspection, it isn't actually sending the requests concurrently; it sends them one after another.
I know that `await resp.status` is wrong because `resp.status` is not awaitable, but I tried removing the `await` and it made no difference.
Any help would be much appreciated! Thanks!
import aiohttp
import asyncio
import time

start_time = time.time()

async def main():
    # List of 1000 subdomains; some subdomains do not exist
    data = ["LIST OF 1000 SUBDOMAINS"]
    async with aiohttp.ClientSession() as session:
        for url in data:
            pokemon_url = f'{url}'
            try:
                async with session.get(pokemon_url, ssl=False) as resp:
                    pokemon = await resp.status
                    # If the subdomain exists, print the status
                    print(pokemon)
            except:
                # otherwise print the subdomain that does not exist or cannot be reached
                print(url)

asyncio.run(main())
print("--- %s seconds ---" % (time.time() - start_time))
CodePudding user response:
"I have tried multiple techniques even grequests"

grequests works fine for this; you don't have to use async if you don't want to.
import grequests
import time

urls = ['https://httpbin.org/delay/4' for _ in range(4)]
# each of these requests takes 4 seconds to complete
# serially, these would take at least 16 (4 * 4) seconds to complete
reqs = [grequests.get(url) for url in urls]

start = time.time()
for resp in grequests.imap(reqs, size=4):
    print(resp.status_code)
end = time.time()
print('finished in', round(end - start, 2), 'seconds')
200
200
200
200
finished in 4.32 seconds
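For completeness: the reason the original aiohttp code is sequential is that it awaits each `session.get` inside the `for` loop, so every request must finish before the next one starts. The usual fix is to build one coroutine per URL and run them all with `asyncio.gather`. Here is a minimal sketch of that pattern using `asyncio.sleep` as a stand-in for the real HTTP call, so it runs without a network; the URL names are placeholders:

```python
import asyncio
import time

async def fetch(url: str, delay: float) -> str:
    # Stand-in for `async with session.get(url) as resp:`;
    # just sleeps instead of doing real I/O.
    await asyncio.sleep(delay)
    return f"{url}: 200"

async def sequential(urls):
    # Awaiting inside the loop: each "request" finishes before the next starts.
    return [await fetch(u, 0.1) for u in urls]

async def concurrent(urls):
    # Schedule all coroutines first, then await them together.
    return await asyncio.gather(*(fetch(u, 0.1) for u in urls))

urls = [f"https://sub{i}.example.com" for i in range(10)]

start = time.time()
asyncio.run(sequential(urls))
seq_elapsed = time.time() - start   # roughly 10 * 0.1 = 1 second

start = time.time()
results = asyncio.run(concurrent(urls))
conc_elapsed = time.time() - start  # roughly 0.1 seconds, the sleeps overlap

print(f"sequential: {seq_elapsed:.2f}s, concurrent: {conc_elapsed:.2f}s")
```

Applied to the question's code, each `session.get(...)` / `resp.status` body becomes its own coroutine that is gathered once, with the `try`/`except` moved inside that coroutine (or `return_exceptions=True` passed to `gather`) so one unreachable subdomain doesn't cancel the rest. For 1000 subdomains you would also want to cap concurrency, e.g. with an `asyncio.Semaphore`, much like the `size=4` argument to `grequests.imap` above.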