Does HTML have containers that cannot be seen with the soup method?

Time:08-11

I am trying to make an app that takes values from a website. For instance, from this page [https://steamcommunity.com/id/pintipanda/games/?tab=all] I want to get the id of every div that has the class "gameListRow".

But when I try:

from bs4 import BeautifulSoup
import requests

html_text = requests.get('https://steamcommunity.com/id/pintipanda/games/?tab=all').text
soup = BeautifulSoup(html_text, 'lxml')
div = soup.find_all('div', {'class': 'gameListRow'})

print(div)

It prints an empty list. How can I select all the divs with the class gameListRow?

CodePudding user response:

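The data you see is stored in a JavaScript variable inside a <script> tag on the page. requests only downloads the raw HTML and does not run JavaScript, so the gameListRow divs are never created and BeautifulSoup cannot find them. To parse the data, you can use this example: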

import re
import json
import requests


url = "https://steamcommunity.com/id/pintipanda/games/?tab=all"

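# Download the raw page source; the game list is embedded in it as a JavaScript array.
data = requests.get(url).text
# Capture the array literal assigned to rgGames, up to the closing "];".
data = re.search(r"var rgGames = (.*]);", data).group(1)
# The captured text is parsed as JSON to get a list of dicts.
data = json.loads(data)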

# uncomment to print all data:
# print(json.dumps(data, indent=4))

for d in data:
    print("{:<10} {}".format(d["appid"], d["name"]))

Prints:

730        Counter-Strike: Global Offensive
578080     PUBG: BATTLEGROUNDS
261550     Mount & Blade II: Bannerlord
570        Dota 2
305620     The Long Dark
550        Left 4 Dead 2
413150     Stardew Valley


...
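
As a variation on the approach above, here is a minimal sketch that separates the parsing from the download and returns an empty list when the variable is missing, instead of failing on .group(1). The helper name extract_games and the sample HTML are made up for illustration; only the variable name rgGames comes from the answer. It uses json.JSONDecoder.raw_decode to read the array starting right after the assignment, so it does not depend on where the closing "];" falls.

```python
import json
import re


def extract_games(html):
    """Return the list assigned to `var rgGames` in the page source, or [] if absent.

    Hypothetical helper, not part of the original answer.
    """
    # Locate the assignment; marker.end() is the index where the array starts.
    marker = re.search(r"var rgGames\s*=\s*", html)
    if marker is None:
        # Variable not present in the source (assumed to mean there is no game list).
        return []
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (here, the trailing ";" and more script).
    games, _end = json.JSONDecoder().raw_decode(html, marker.end())
    return games


# Made-up sample standing in for the real page source.
sample_html = """
<div id="games_list_rows"></div>
<script>
var rgGames = [{"appid": 730, "name": "Counter-Strike: Global Offensive"}, {"appid": 570, "name": "Dota 2"}];
var rgChangingGames = [];
</script>
"""

games = extract_games(sample_html)
app_ids = [game["appid"] for game in games]
print(app_ids)  # prints [730, 570]
```

With the real page, the same function could be called on requests.get(url).text; whether the variable is present there depends on the page, which this sketch does not check beyond returning an empty list.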