(Python) How can I apply asyncio in while loop with accumulator?

Time:12-11

I have a block of code that works well for fetching data from a specific site's API. The issue is that the site returns at most 50 objects per call, so I have to make many calls, and as a result the fetching takes too long (sometimes nearly 20 minutes). Here is my code:

import concurrent.futures
import requests

supply = 3000
offset = 0

token_ids = []
while offset < supply:
    url = "url_1" + str(offset)
    response = requests.request("GET", url)
    a = response.json()
    assets = a["assets"]

    def get_token_ids(an):
        if str(an['sell_orders']) == 'None' and str(an['last_sale']) == 'None' and str(an['num_sales']) == '0':
            token_ids.append(str(an['token_id']))

    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = [executor.submit(get_token_ids, asset) for asset in assets]

    offset += 50

print(token_ids)

The problem is that the code waits for each request to finish before making the next one. I am thinking of an improvement: once a request is sent, the offset is incremented and the loop moves on to the next request, so I don't have to wait. I don't know how to do this; I studied asyncio, but it is still a challenge for me. Can anyone help me with this?

CodePudding user response:

The problem is that Requests is not an asynchronous library, so each of its network calls blocks until it completes.

https://docs.python-requests.org/en/latest/user/advanced/#blocking-or-non-blocking

Therefore, it is better to try asynchronous libraries, for example, aiohttp:

https://github.com/aio-libs/aiohttp
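(If switching HTTP libraries is not an option, one alternative, separate from the aiohttp approach below, is to keep Requests and push each blocking call into a thread with asyncio.to_thread, available since Python 3.9. A minimal sketch, with a dummy function standing in for the real requests call:)

```python
import asyncio

def blocking_fetch(offset):
    # Stand-in for requests.get("url_1" + str(offset)).json()
    return {"assets": [{"token_id": str(offset)}]}

async def main():
    # to_thread runs each blocking call in the default thread pool,
    # so the event loop can overlap all of them.
    results = await asyncio.gather(
        *[asyncio.to_thread(blocking_fetch, o) for o in range(0, 150, 50)]
    )
    return [a["token_id"] for r in results for a in r["assets"]]

print(asyncio.run(main()))  # → ['0', '50', '100']
```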

Example

Create session for all connections:

async with aiohttp.ClientSession() as session:

and run all desired requests:

        results = await asyncio.gather(
            *[get_data(session, offset) for offset in range(0, supply, step)]
        )

Here the requests run concurrently: session.get(url) retrieves only the response headers, and the body is fetched with await response.json():

    async with session.get(url) as response:
        a = await response.json()

Finally, the main block starts the event loop:

    loop = asyncio.get_event_loop()
    token_ids = loop.run_until_complete(main())
    loop.close()
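(On Python 3.7 and newer, the same entry point can be written more simply with asyncio.run, which creates and closes the event loop for you. A self-contained sketch, with a trivial coroutine standing in for the real main():)

```python
import asyncio

async def main():
    # Stand-in for the real main() coroutine above
    await asyncio.sleep(0)
    return ["token_a", "token_b"]

# asyncio.run() (Python 3.7+) creates the event loop, runs the
# coroutine to completion, and closes the loop automatically.
token_ids = asyncio.run(main())
print(token_ids)
```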

The full code

import aiohttp
import asyncio


async def get_data(session, offset):

    token_ids = []
    url = "url_1" + str(offset)

    async with session.get(url) as response:
        # For tests:
        # print("Status:", response.status)
        # print("Content-type:", response.headers['content-type'])
        a = await response.json()

    assets = a["assets"]

    for asset in assets:
        if str(asset['sell_orders']) == 'None' and str(asset['last_sale']) == 'None' and str(asset['num_sales']) == '0':
            token_ids.append(str(asset['token_id']))

    return token_ids


async def main():
    supply = 3000
    step = 50
    token_ids = []
    # Create session for all connections and pass it to "get" function
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(
            *[get_data(session, offset) for offset in range(0, supply, step)]
        )

    for ids in results:
        token_ids.extend(ids)

    return token_ids


if __name__ == "__main__":
    # asynchronous code start here
    loop = asyncio.get_event_loop()
    token_ids = loop.run_until_complete(main())
    loop.close()
    # asynchronous code end here

    print(token_ids)
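One caveat: asyncio.gather above launches all 60 requests at once, and some APIs throttle or reject that. A minimal sketch of capping in-flight requests with asyncio.Semaphore (asyncio.sleep stands in for the real session.get call, and the limit of 10 is an assumed value to tune against the API):

```python
import asyncio

CONCURRENCY = 10  # assumed limit; tune to what the API tolerates

async def fetch(sem, offset):
    # The semaphore lets at most CONCURRENCY coroutines past this
    # point at a time; the rest wait here until a slot frees up.
    async with sem:
        await asyncio.sleep(0.01)  # stand-in for session.get(...)
        return [f"token_{offset}"]

async def main():
    sem = asyncio.Semaphore(CONCURRENCY)
    results = await asyncio.gather(
        *[fetch(sem, o) for o in range(0, 3000, 50)]
    )
    token_ids = []
    for ids in results:
        token_ids.extend(ids)
    return token_ids

token_ids = asyncio.run(main())
print(len(token_ids))  # → 60
```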