How can I get both the tag's text and the text after it?

Time:07-05

<span class="sort-text">Dominus Estate</span> Napa Valley
name_list = []
name_tags = soup.find_all("span", class_="sort-text")

for name in name_tags:
    name = name.get_text()
    name_list.append(name)

print(name_list)

This gives me only the text inside the span:

Dominus Estate

But I want the following:

Dominus Estate Napa Valley

CodePudding user response:

You can use .next_sibling to get the text node that follows the span, then concatenate the two strings:

name_tags = soup.find_all('span', {"class": "sort-text"})
for name in name_tags:
    # span text + the text node that directly follows the span
    name = f"{name.get_text()} {name.next_sibling.get_text(strip=True)}"
Example
from bs4 import BeautifulSoup

html='''
<span>
    <span class="sort-text">Dominus Estate</span> Napa Valley
    <span>2018</span>
    <span> </span>
    <span>-</span>
</span>
<span>
    <span class="sort-text">Château Pichon Longueville Lalande</span> Pauillac
    <span>2018</span>
    <span> </span>
    <span>-</span>
</span>
'''
soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly
name_list = []
name_tags = soup.find_all('span', {"class": "sort-text"})

for name in name_tags:
    # span text + the text node that directly follows the span
    name = f"{name.get_text()} {name.next_sibling.get_text(strip=True)}"
    name_list.append(name)

print(name_list)
Output
['Dominus Estate Napa Valley', 'Château Pichon Longueville Lalande Pauillac',...]
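If some spans on the real page are followed directly by another tag, or by nothing at all, .next_sibling will not be a text node and the concatenation above may fail or pick up the wrong text. A minimal sketch of a more defensive variant follows. The helper name name_with_region, the class name sort-text on the inner spans, and the second entry "Example Estate" are assumptions made for illustration, not taken from the real page:

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup: the second name span is followed by a tag, not by text.
html = """
<span>
    <span class="sort-text">Dominus Estate</span> Napa Valley
    <span>2018</span>
</span>
<span>
    <span class="sort-text">Example Estate</span><span>2018</span>
</span>
"""


def name_with_region(tag):
    """Return the tag's text plus any text node that directly follows it."""
    name = tag.get_text(strip=True)
    sibling = tag.next_sibling
    # Skip the sibling when it is a Tag or None rather than a text node.
    if isinstance(sibling, NavigableString):
        trailing = sibling.strip()
        if trailing:
            return f"{name} {trailing}"
    return name


soup = BeautifulSoup(html, "html.parser")
name_list = [name_with_region(tag)
             for tag in soup.find_all("span", class_="sort-text")]
print(name_list)
```

With this sample markup the list is ['Dominus Estate Napa Valley', 'Example Estate']: the second entry falls back to the span text alone because its next sibling is a tag.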