Can't figure out how to scrape an ID with beautifulsoup

Time:06-12

I'm trying to scrape an element from a site by its id, but I can't figure out how to get it working:

from bs4 import BeautifulSoup
import requests

url= "Website"
page= requests.get(url)

soup = BeautifulSoup(page.content, 'html.parser')
lists = soup.find_all ('div', class_="position-relative")

for list in lists:
    Value = list.find('h5', id_= "player_value")
print (Value)

With that code, it just prints:

None

Here is what the website looks like in the browser's inspect mode:

[screenshot of the inspected HTML, image not preserved]

CodePudding user response:

You can also pass the class in a dict. Try this:

lists = soup.find_all('div', {'class': 'position-relative'})
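As a quick check, the sketch below compares the dict form with the class_ keyword form. The HTML snippet is a made-up stand-in for the real page, which is not shown in the question; on this input both calls select the same elements, so switching between them should not by itself change a None result.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption).
html = """
<div class="position-relative"><h5 id="player_value">1</h5></div>
<div class="other"><h5>ignored</h5></div>
"""
soup = BeautifulSoup(html, "html.parser")

# Two ways to filter on the CSS class.
by_keyword = soup.find_all("div", class_="position-relative")
by_dict = soup.find_all("div", {"class": "position-relative"})

print(len(by_keyword), len(by_dict))
print(by_keyword == by_dict)
```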

CodePudding user response:

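Remove the trailing underscore from the id keyword argument. With id_, Beautiful Soup looks for an attribute literally named "id_", which no tag has, so find() returns None:

.find('h5', id="player_value")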

Why the underscore is needed for class, according to the docs:

“class”, is a reserved word in Python. Using class as a keyword argument will give you a syntax error. As of Beautiful Soup 4.1.2, you can search by CSS class using the keyword argument class_
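To see the difference in practice, here is a minimal sketch on a one-line HTML string (the snippet is an assumption, not the real page). It compares the id_ spelling from the question with id, and with the attrs dict, which works for any attribute name.

```python
from bs4 import BeautifulSoup

# Hypothetical one-tag document for illustration.
html = '<h5 id="player_value">1</h5>'
soup = BeautifulSoup(html, "html.parser")

# id_ is not special-cased: it filters on an attribute called "id_".
wrong = soup.find("h5", id_="player_value")

# id is not a Python keyword, so it needs no underscore.
right = soup.find("h5", id="player_value")

# Equivalent form using an explicit attribute dict.
via_attrs = soup.find("h5", attrs={"id": "player_value"})

print(wrong)
print(right.text)
print(via_attrs.text)
```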

Example

Assuming the id is unique, you can get the value directly:

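from bs4 import BeautifulSoup

html = '''
<h5 id="player_value">1</h5>
'''
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, 'html.parser')

# find() returns the first matching tag; .text gives its text content
player_value = soup.find('h5', id='player_value').text
print(player_value)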

If the id of your <h5> is not unique and you want every match, use find_all(). Also avoid naming variables after Python built-ins such as list:

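from bs4 import BeautifulSoup

html = '''
<h5 id="player_value">1</h5>
<h5 id="player_value">2</h5>
<h5 id="player_value">3</h5>
<h5 id="player_value">4</h5>
'''

# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, 'html.parser')

# find_all() returns every matching tag, not just the first
for tag in soup.find_all('h5', id='player_value'):
    print(tag.text)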
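Putting the pieces together, the loop from the question could be rewritten roughly as follows. This is only a sketch: the HTML is invented because the real page is not shown, and the names container and values are illustrative. It uses id instead of id_, avoids shadowing list, collects each value inside the loop, and skips containers where find() returns None.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure is an assumption.
html = """
<div class="position-relative"><h5 id="player_value">1</h5></div>
<div class="position-relative"><p>no value here</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

values = []
for container in soup.find_all("div", class_="position-relative"):
    value = container.find("h5", id="player_value")
    if value is not None:  # find() returns None when nothing matches
        values.append(value.get_text(strip=True))

print(values)
```

With a live site, html would come from requests.get(url).content as in the question. If the result is still empty, the element may be rendered by JavaScript and so be absent from the downloaded HTML.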