<span class="sort-text">Dominus Estate</span> Napa Valley
name_list = []
# Match elements by CSS class; "class" is not a tag name
name_tags = soup.find_all(class_="sort-text")
for name in name_tags:
    name = name.get_text()
    name_list.append(name)
print(name_list)
Dominus Estate
but I want the following:
Dominus Estate Napa Valley
CodePudding user response:
You can use .next_sibling to get the text node that follows the span, then concatenate both strings:
# attrs must be a dict ({"class": ...}), not a set ({"class", ...})
name_tags = soup.find_all('span', {"class": "sort-text"})
for name in name_tags:
    name = f"{name.get_text()} {name.next_sibling.get_text(strip=True)}"
Example
from bs4 import BeautifulSoup

html = '''
<span>
<span class="sort-text">Dominus Estate</span> Napa Valley
<span>2018</span>
<span> </span>
<span>-</span>
</span>
<span>
<span class="sort-text">Château Pichon Longueville Lalande</span> Pauillac
<span>2018</span>
<span> </span>
<span>-</span>
</span>
'''
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")

name_list = []
name_tags = soup.find_all('span', {"class": "sort-text"})
for name in name_tags:
    # next_sibling is the text node directly after the span
    name = f"{name.get_text()} {name.next_sibling.get_text(strip=True)}"
    name_list.append(name)
print(name_list)
Output
['Dominus Estate Napa Valley', 'Château Pichon Longueville Lalande Pauillac',...]
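If some spans might have no trailing text (for example, the span is the last node, or is followed directly by another tag), calling a method on .next_sibling can fail or pick up the wrong text. The following is a minimal sketch of a more defensive variant, not part of the original answer. The helper name names_with_trailing_text, the sample string, and the "Example Wine" entry are hypothetical and only serve as illustration.

```python
from bs4 import BeautifulSoup, NavigableString


def names_with_trailing_text(html, class_name="sort-text"):
    """Join each matching span's text with the bare text that follows it.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for tag in soup.find_all("span", class_=class_name):
        parts = [tag.get_text(strip=True)]
        sibling = tag.next_sibling
        # Use the sibling only when it is a text node, not None or another tag.
        if isinstance(sibling, NavigableString):
            tail = sibling.strip()  # NavigableString is a str subclass
            if tail:
                parts.append(tail)
        results.append(" ".join(parts))
    return results


# Made-up sample: the second span has no text after it.
sample = (
    '<span><span class="sort-text">Dominus Estate</span> Napa Valley '
    '<span>2018</span></span>'
    '<span class="sort-text">Example Wine</span>'
)
print(names_with_trailing_text(sample))
```

Using str.strip() on the sibling instead of get_text() also avoids depending on NavigableString supporting get_text(), which older Beautiful Soup releases may not provide.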