python async requests_html div not loaded with JS (?) data

Time:06-08

I'm trying to fetch links to guides for a given class and its play style. The div responsible for rendering them is highlighted in yellow in the screenshot. I need to use async because this class is used in a discord.py bot, and trying to use HTMLSession() resulted in an error saying I need to use AsyncHTMLSession.

However, my code outputs this div as empty.

Code:

from requests_html import AsyncHTMLSession

asession = AsyncHTMLSession()

class Scrapper:
    def __init__(self):
        self.headers = {'User-Agent':'Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0'}

    async def return_page(self, url):
        
        response = await asession.get(url)
        await response.html.arender(timeout=15, sleep=5)
        #article = response.html.find('#filter-results', first=True)
        print(response.html.html)
        return response

    async def return_build_articles(self, userClass, instance):
        url = f"https://immortal.maxroll.gg/category/build-guides#classes=[di-{userClass}]&metas=[di-{instance}]"
        articles = await self.return_page(url)

Selected part of the output:

</div>
</div>
</div>
</div>
</div>
</form> <hr >
<div id="immortal-mobile-mid-banner" ></div>
<div id="filter-results" ><!-- here should be the results --></div>
<div  role="navigation"></div>
</div>
</div>
</div>

Answer:

The data you see on the page is loaded with JavaScript from an external URL. To get the data asynchronously, you can use the aiohttp package. For example:

import json
import asyncio
import aiohttp


url = "https://site-search-origin.maxroll.gg/indexes/wp_posts_immortal/search"

data = {
    "filters": '(classes = "di/necromancer") AND (metas = "di/PvP") AND (category = "Build Guides")',
    "limit": 1000,
    "offset": 0,
    "q": "",
}

headers = {
    "X-Meili-API-Key": "3c58012ad106ee8ff2c6228fff2161280b1db8cda981635392afa3906729bade"
}


async def main():
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=data, headers=headers) as resp:
            json_data = await resp.json()

    # uncomment to print all data:
    # print(json.dumps(json_data, indent=4))

    for hit in json_data["hits"]:
        print(hit["post_title"])
        print(hit["permalink"])
        print("-" * 80)


asyncio.run(main())

Prints:

Bone Spikes Necromancer PvP Guide
https://immortal.maxroll.gg/build-guides/bone-spikes-necromancer-pvp-guide-battlegrounds-rite-of-exile
--------------------------------------------------------------------------------
Bone Wall Necromancer PvP Guide
https://immortal.maxroll.gg/build-guides/bone-wall-necromancer-pvp-guide-battlegrounds-rite-of-exile
--------------------------------------------------------------------------------
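
To plug this into the class from the question, the hard-coded filter can be built from the class and instance arguments. The following is only a sketch: the names `build_payload`, `extract_links` and `fetch_build_articles` are made up for illustration, and it assumes that the values passed in already match what the search index expects (for example `necromancer` and `PvP`, as in the filter string above). Other values may need a different spelling or case.

```python
SEARCH_URL = "https://site-search-origin.maxroll.gg/indexes/wp_posts_immortal/search"

# API key copied from the example above; it may change on the site's side.
HEADERS = {
    "X-Meili-API-Key": "3c58012ad106ee8ff2c6228fff2161280b1db8cda981635392afa3906729bade"
}


def build_payload(user_class, instance, limit=1000):
    """Build the search body for one class/meta pair.

    Assumption: the filter format is the same as in the example above,
    with only the class and meta values swapped in.
    """
    filters = (
        f'(classes = "di/{user_class}") AND (metas = "di/{instance}") '
        'AND (category = "Build Guides")'
    )
    return {"filters": filters, "limit": limit, "offset": 0, "q": ""}


def extract_links(json_data):
    """Return (title, url) pairs from a search response; empty list if no hits."""
    return [
        (hit["post_title"], hit["permalink"])
        for hit in json_data.get("hits", [])
    ]


async def fetch_build_articles(user_class, instance):
    """Query the search endpoint and return (title, url) pairs."""
    # Imported here so the two helpers above work without aiohttp installed.
    import aiohttp

    payload = build_payload(user_class, instance)
    async with aiohttp.ClientSession() as session:
        async with session.post(SEARCH_URL, json=payload, headers=HEADERS) as resp:
            resp.raise_for_status()  # fail loudly on HTTP errors
            return extract_links(await resp.json())
```

Inside a discord.py command you would `await fetch_build_articles("necromancer", "PvP")` directly instead of calling `asyncio.run`, because the bot is already running an event loop.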