Scrape data the first span instance of same div class name

Time:09-26

I want to scrape the data of the first span instance of divs with similar names in Python using BeautifulSoup (BS4). Here is the HTML code:

<div >
                <h3>Network Overview</h3>
                <div >
                    <div ><span>32</span><span>Games</span></div>
                    <div ><span>83,681,202,831.85</span><span>Award</span></div>
                    <div ><span>18</span><span>Top players</span></div>
                </div>
</div>

The HTML code above is from a website; I copied only the portion I want to scrape. For example, my scraped data should look like:

32

83,681,202,831.85

18

I am new to data scraping in Python. I've tried the code below, but it failed:

import requests
from bs4 import BeautifulSoup

# url holds the address of the page being scraped
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

div = soup.find_all("div", class_="networkstat")
value1 = div[0].find("span").get_text().strip()
value2 = div[1].find("span").get_text().strip()
value3 = div[2].find("span").get_text().strip()

print(value1, value2, value3)

Any help is appreciated.

CodePudding user response:

You could use CSS selectors to select the first span child of each div:

for e in soup.select('.networkstat span:first-child'):
    print(e.get_text())

Example

from bs4 import BeautifulSoup

html = '''
<div >
                <h3>Network Overview</h3>
                <div >
                    <div ><span>32</span><span>Games</span></div>
                    <div ><span>83,681,202,831.85</span><span>Award</span></div>
                    <div ><span>18</span><span>Top players</span></div>
                </div>
</div>
'''
soup = BeautifulSoup(html, 'html.parser')
for e in soup.select('.networkstat span:first-child'):
    print(e.get_text())
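
If the markup ever places another element before the value, span:first-child would stop matching, so looking up the first span inside each div may be a bit more robust. Here is a minimal sketch of that idea. It assumes the three inner divs carry class="networkstat" (the class attributes are not visible in the snippet above, so that placement is a guess), and first_span_values is a hypothetical helper name:

```python
from bs4 import BeautifulSoup

# Assumption: each inner stat div carries class="networkstat". The posted
# snippet does not show its class attributes, so adjust this to the real page.
html = '''
<div>
    <h3>Network Overview</h3>
    <div>
        <div class="networkstat"><span>32</span><span>Games</span></div>
        <div class="networkstat"><span>83,681,202,831.85</span><span>Award</span></div>
        <div class="networkstat"><span>18</span><span>Top players</span></div>
    </div>
</div>
'''


def first_span_values(markup, class_name="networkstat"):
    """Return the stripped text of the first <span> inside each matching div."""
    soup = BeautifulSoup(markup, "html.parser")
    values = []
    for div in soup.find_all("div", class_=class_name):
        span = div.find("span")  # first <span> descendant, or None if absent
        if span is not None:
            values.append(span.get_text(strip=True))
    return values


values = first_span_values(html)
print(values)  # ['32', '83,681,202,831.85', '18']
```

If the list comes back empty on the live page, no div matched the class name. In that case it is worth checking the class in the page source, or whether the numbers are filled in by JavaScript after the page loads, since requests only sees the initial HTML.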

CodePudding user response:

Your code should work. The only change I would try is in requests.get(url).content: pass .text (the decoded string) instead of .content (the raw bytes).
Try this: soup = BeautifulSoup(requests.get(url).text, 'html.parser')
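
Note that BeautifulSoup accepts both str and bytes, so this swap may not change the result on its own. A quick way to check without any network access is sketched below. The markup and the placement of the networkstat class are assumptions, and first_value is a hypothetical helper name:

```python
from bs4 import BeautifulSoup

# Assumed markup: the class placement is a guess, because the snippet in the
# question does not show its class attributes.
markup_text = '<div class="networkstat"><span>32</span><span>Games</span></div>'
markup_bytes = markup_text.encode("utf-8")  # stands in for response.content


def first_value(markup):
    """Parse str or bytes and return the text of the first span in the div."""
    soup = BeautifulSoup(markup, "html.parser")
    div = soup.find("div", class_="networkstat")
    return div.find("span").get_text(strip=True)


from_text = first_value(markup_text)    # like passing response.text
from_bytes = first_value(markup_bytes)  # like passing response.content
print(from_text, from_bytes)  # 32 32
```

The two can differ when a page declares its encoding incorrectly: .text is decoded by requests using the encoding it inferred from the response, while bytes passed to BeautifulSoup go through its own encoding detection.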
