Extracting text between specific <span> </span> tags

I am writing my first (not print("hello world")) Python program, but I ran into a problem extracting what I really need from the output. I checked many similar topics, but their solutions didn't work as expected.

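import requests
from bs4 import BeautifulSoup

# cityChoose holds the city part of the URL and is set earlier in the program
page = requests.get(
    "https://www.oferty.net/statystyki/012022/mieszkania-sprzedaz-" + cityChoose)
soup = BeautifulSoup(page.content, 'html.parser')
content = soup.find('div', {"class": "subtitle2"})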

print(content)

And I get:

<div class="subtitle2">
<span>
<span>Warszawa</span> - średnia cena 1m² w styczniu 2022 = 13 405 PLN </span>
</div>

How can I extract just the text so that the output looks like this?

Warszawa - średnia cena 1m² w styczniu 2022 = 13 405 PLN

CodePudding user response：

You can find the first span and get all the text inside it, as follows:

import requests
from bs4 import BeautifulSoup
page = requests.get("https://www.oferty.net/statystyki/012022/mieszkania-sprzedaz-" + cityChoose)
soup = BeautifulSoup(page.content, 'html.parser')
content = soup.find('div', {"class": "subtitle2"})

# .strip() removes the newline and trailing space that surround the text in the markup
result = content.find('span').text.strip()
print(result)
# Warszawa - średnia cena 1m² w styczniu 2022 = 13 405 PLN
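
To try this without hitting the website, here is a minimal offline sketch. It assumes the markup is exactly what the question shows and that the div carries class="subtitle2"; SAMPLE_HTML and first_span_text are hypothetical names made up for this illustration. It also guards against find() returning None when the element is missing.

```python
from bs4 import BeautifulSoup

# Assumption: markup copied from the question, with class="subtitle2" on the div.
SAMPLE_HTML = """<div class="subtitle2">
<span>
<span>Warszawa</span> - średnia cena 1m² w styczniu 2022 = 13 405 PLN </span>
</div>"""


def first_span_text(html):
    """Return the text of the first span inside div.subtitle2, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", {"class": "subtitle2"})
    if div is None:
        return None
    span = div.find("span")
    if span is None:
        return None
    # Collapse newlines and repeated spaces into single spaces.
    return " ".join(span.get_text().split())


print(first_span_text(SAMPLE_HTML))
```

Returning None instead of raising makes it easier to notice when the page layout changes and the div is no longer found.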

CodePudding user response：

# import requests
from bs4 import BeautifulSoup as bso
sample_content = """<html>
<head>MyApp</head>
<body>
<div class="subtitle2">
<span>
<span>Warszawa</span> - średnia cena 1m² w styczniu 2022 = 13 405 PLN </span>
</div>
</body>
</html>"""
soup = bso(sample_content, 'html.parser')
content = soup.find('div', {"class": "subtitle2"})
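# recursive=False keeps only the direct children, so nested spans are not counted twice
print(''.join(t.text for t in content.find_all(recursive=False)).strip())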

This is a simple, self-contained example that parses a sample HTML string instead of fetching the page.
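
One thing to watch with nested spans: find_all() and findChildren() walk all descendants by default, so the text of an inner span is collected once on its own and once as part of its parent. The sketch below, assuming the same sample markup with class="subtitle2" (SAMPLE_HTML is a made-up name), compares the default with recursive=False.

```python
from bs4 import BeautifulSoup

# Assumption: markup from the question, with class="subtitle2" on the div.
SAMPLE_HTML = """<div class="subtitle2">
<span>
<span>Warszawa</span> - średnia cena 1m² w styczniu 2022 = 13 405 PLN </span>
</div>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
div = soup.find("div", {"class": "subtitle2"})

# Default: every descendant tag (outer span and inner span).
all_descendants = div.find_all()
# recursive=False: only the tags directly under the div (outer span).
direct_children = div.find_all(recursive=False)

joined_all = "".join(t.text for t in all_descendants)
joined_direct = "".join(t.text for t in direct_children)

print(len(all_descendants), joined_all.count("Warszawa"))
print(len(direct_children), joined_direct.count("Warszawa"))
```

With the default, "Warszawa" appears twice in the joined string; with recursive=False it appears once.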

CodePudding user response：

Thank you, guys! I tried your solutions, but for some reason they didn't work. Maybe I did something wrong, but I found a way by just adding .text in the print call, as below:

page = requests.get(
    "https://www.oferty.net/statystyki/012022/mieszkania-sprzedaz-" + cityChoose)
soup = BeautifulSoup(page.content, 'html.parser')
content = soup.find('div', {"class": "subtitle2"})
print(content.text)
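
Note that content.text keeps the whitespace from the markup, so the printed line may be surrounded by blank lines. A small sketch of tidying that up with plain string methods follows; normalize_whitespace is a hypothetical helper, and raw is what content.text would presumably return for the markup shown in the question.

```python
def normalize_whitespace(text):
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())


# Assumption: the value content.text yields for the markup in the question.
raw = "\n\nWarszawa - średnia cena 1m² w styczniu 2022 = 13 405 PLN \n"
print(normalize_whitespace(raw))
```

In the program above this would be used as print(normalize_whitespace(content.text)).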