I'm trying to fetch links to guides about a given class and its style of play. In the screenshot, the div highlighted in yellow is the one responsible for rendering them. I need to use async because this class is used in a discord.py bot, and using HTMLSession() raised an error saying I need to use AsyncHTMLSession. Website address -
But my code outputs this div as empty.
Code:
from requests_html import AsyncHTMLSession

asession = AsyncHTMLSession()


class Scrapper:
    def __init__(self):
        self.headers = {'User-Agent':'Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0'}

    async def return_page(self, url):
        response = await asession.get(url)
        await response.html.arender(timeout=15, sleep=5)
        #article = response.html.find('#filter-results', first=True)
        print(response.html.html)
        return response

    async def return_build_articles(self, userClass, instance):
        url = f"https://immortal.maxroll.gg/category/build-guides#classes=[di-{userClass}]&metas=[di-{instance}]"
        articles = await self.return_page(url)
Selected part of the output:
</div>
</div>
</div>
</div>
</div>
</form> <hr >
<div id="immortal-mobile-mid-banner" ></div>
<div id="filter-results" ><!-- here should be the results --></div>
<div role="navigation"></div>
</div>
</div>
</div>
CodePudding user response:
The data you see on the page is loaded by JavaScript from an external URL. To fetch it asynchronously, you can use the aiohttp
package. For example:
import json
import asyncio
import aiohttp

url = "https://site-search-origin.maxroll.gg/indexes/wp_posts_immortal/search"

data = {
    "filters": '(classes = "di/necromancer") AND (metas = "di/PvP") AND (category = "Build Guides")',
    "limit": 1000,
    "offset": 0,
    "q": "",
}

headers = {
    "X-Meili-API-Key": "3c58012ad106ee8ff2c6228fff2161280b1db8cda981635392afa3906729bade"
}


async def main():
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=data, headers=headers) as resp:
            json_data = await resp.json()

    # uncomment to print all data:
    # print(json.dumps(json_data, indent=4))

    for hit in json_data["hits"]:
        print(hit["post_title"])
        print(hit["permalink"])
        print("-" * 80)


asyncio.run(main())
Prints:
Bone Spikes Necromancer PvP Guide
https://immortal.maxroll.gg/build-guides/bone-spikes-necromancer-pvp-guide-battlegrounds-rite-of-exile
--------------------------------------------------------------------------------
Bone Wall Necromancer PvP Guide
https://immortal.maxroll.gg/build-guides/bone-wall-necromancer-pvp-guide-battlegrounds-rite-of-exile
--------------------------------------------------------------------------------
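Since the question passes the class and the instance as arguments, the same request could be parameterized along these lines. This is only a sketch: build_payload and fetch_build_guides are hypothetical helper names, the filter syntax and API key are copied from the example above, and it assumes the index expects values such as "necromancer" and "PvP" in exactly that spelling.

```python
SEARCH_URL = "https://site-search-origin.maxroll.gg/indexes/wp_posts_immortal/search"

# API key copied from the example above; it may change or be revoked at any time.
HEADERS = {
    "X-Meili-API-Key": "3c58012ad106ee8ff2c6228fff2161280b1db8cda981635392afa3906729bade"
}


def build_payload(user_class, instance, limit=1000):
    """Build the search body for one class/instance pair (hypothetical helper)."""
    filters = (
        f'(classes = "di/{user_class}") AND '
        f'(metas = "di/{instance}") AND '
        '(category = "Build Guides")'
    )
    return {"filters": filters, "limit": limit, "offset": 0, "q": ""}


async def fetch_build_guides(session, user_class, instance):
    """Return a list of (title, permalink) tuples.

    `session` is expected to be an open aiohttp.ClientSession, so one
    session can be shared across bot commands instead of being recreated.
    """
    payload = build_payload(user_class, instance)
    async with session.post(SEARCH_URL, json=payload, headers=HEADERS) as resp:
        resp.raise_for_status()  # fail loudly on HTTP errors
        json_data = await resp.json()
    return [(hit["post_title"], hit["permalink"]) for hit in json_data["hits"]]
```

Inside a discord.py command you would then await fetch_build_guides(session, "necromancer", "PvP") with a session created once at startup, rather than calling asyncio.run(), because the bot already runs its own event loop.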