Web scrape second number between tags

I am new to Python and have never worked with HTML, so any help would be appreciated. I need to extract two numbers, '1062' and '348', from the markup I see in the browser's Inspect Element view. This is my code:

page = requests.get("https://www.traderscockpit.com/?pageView=live-nse-advance-decline-ratio-chart")

soup = BeautifulSoup(page.content, 'html.parser')

Adv = soup.select_one ('.col-sm-6 .advDec:nth-child(1)').text[10:]

Dec = soup.select_two ('.col-sm-6 .advDec:nth-child(2)').text[10:]

The website element looks like below:

<div >
            <div >
                <div >
                    <h4>Stocks</h4>
                </div>
                <div >
                    <p ><a href="/?pageView=nse-top-gainers" title="Click to view list of Advanced stocks">Advanced:</a> 1062</p>
                </div>
                <div >
                    <p ><a href="/?pageView=nse-top-losers" title="Click to view list of Declined stocks">Declined:</a> 348</p>
                </div>
            </div>
        </div>

Using my code, I am able to extract the first number (1062), but not the second number (348). Can you please help?

CodePudding user response：

Assuming the pattern is always the same, you can select each element by its text and read the number from its next_sibling:

adv = soup.select_one('a:-soup-contains("Advanced:")').next_sibling.strip()
dec = soup.select_one('a:-soup-contains("Declined:")').next_sibling.strip()

Example

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.traderscockpit.com/?pageView=live-nse-advance-decline-ratio-chart")
soup = BeautifulSoup(page.content, "html.parser")  # name the parser explicitly to avoid a warning

adv = soup.select_one('a:-soup-contains("Advanced:")').next_sibling.strip()
dec = soup.select_one('a:-soup-contains("Declined:")').next_sibling.strip()

print(adv, dec)
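
The same idea can be tried offline against the markup from the question. This is only a minimal sketch: the helper name number_after_label is hypothetical, the sample HTML is copied from the question without its attributes, and it assumes a soupsieve version recent enough to support :-soup-contains. It returns None instead of raising when the label or the number is missing.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (attributes were not shown there).
SAMPLE_HTML = """
<div>
  <div><h4>Stocks</h4></div>
  <div><p><a href="/?pageView=nse-top-gainers">Advanced:</a> 1062</p></div>
  <div><p><a href="/?pageView=nse-top-losers">Declined:</a> 348</p></div>
</div>
"""

def number_after_label(soup, label):
    """Hypothetical helper: integer that follows the <a> containing `label`, or None."""
    link = soup.select_one(f'a:-soup-contains("{label}")')
    if link is None or link.next_sibling is None:
        return None
    # next_sibling is the text node right after the link, e.g. " 1062"
    text = str(link.next_sibling).strip()
    return int(text) if text.isdigit() else None

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
adv = number_after_label(soup, "Advanced:")
dec = number_after_label(soup, "Declined:")
print(adv, dec)
```

Converting to int here is a choice made for the sketch; keep the stripped string instead if the site ever shows non-numeric text in that position.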

CodePudding user response：

If there are always exactly two matching elements, the simplest approach is probably to unpack the list of selected elements directly into two variables:

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.traderscockpit.com/?pageView=live-nse-advance-decline-ratio-chart")
soup = BeautifulSoup(page.content, "html.parser")

# Unpacking raises ValueError if the page does not contain exactly two matches
adv, dec = [elm.next_sibling.strip() for elm in soup.select(".advDec a")]
print("Advanced:", adv)
print("Declined:", dec)
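
As for why the original attempt probably failed: select_two is not a BeautifulSoup method, and :nth-child(2) counts position within the parent element, not among all matches. The sketch below illustrates this offline. The class names col-sm-6 and advDec are assumptions taken from the selectors in the question, since the posted markup does not show its attributes.

```python
from bs4 import BeautifulSoup

# Assumed markup: class placement is a guess based on the question's selectors.
SAMPLE_HTML = """
<div class="col-sm-6">
  <div><h4>Stocks</h4></div>
  <div><p class="advDec"><a href="/?pageView=nse-top-gainers">Advanced:</a> 1062</p></div>
  <div><p class="advDec"><a href="/?pageView=nse-top-losers">Declined:</a> 348</p></div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Each <p class="advDec"> is the first child of its own <div>,
# so no .advDec element is ever the second child of its parent.
second = soup.select_one(".col-sm-6 .advDec:nth-child(2)")
print(second)

# Keying on the label text avoids relying on element order or count.
counts = {
    a.get_text(strip=True).rstrip(":"): int(a.next_sibling.strip())
    for a in soup.select(".advDec a")
}
print(counts)
```

If the real page nests the elements differently, the nth-child explanation may not apply, but building a dictionary keyed by label should still work as long as each number directly follows its link.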