Scraping the text of a <span> that contains another <span> when both have inner text

Time:07-04

This is the HTML. I want to get the text from the outer span:

<span >
     <span > mobile </span>   
     ItsMobileNumber
</span>

So there is one main span that contains an inner span plus some text, 'ItsMobileNumber'. I want to get only 'ItsMobileNumber', but when I use get_text() it returns both texts, like this:

mobile
ItsMobileNumber

And this is my Python code:

print(title.find("span").get_text())

How can I get just 'ItsMobileNumber', without the inner span's text?

CodePudding user response:

Try something like this:

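from bs4 import BeautifulSoup as bs
soup = bs(html, 'lxml')  # html: your page source as a string

# Adjust the selector so that it matches your outer span
data = soup.select("span.ms-2.d-flex")
for datum in data:
    # strings[0] is leading whitespace, [1] is the inner span's text, [2] is the trailing text
    print(list(datum.strings)[2].strip())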

The output, based only on your sample html, should be

ItsMobileNumber
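
The fixed index 2 above relies on a whitespace-only string sitting before the inner span. A minimal sketch that drops blank strings first, assuming the sample markup from the question and the built-in "html.parser" (the names `outer` and `parts` are only illustrative):

```python
from bs4 import BeautifulSoup

# Sample markup from the question (attributes omitted, as in the post)
html = """
<span>
     <span> mobile </span>
     ItsMobileNumber
</span>
"""

soup = BeautifulSoup(html, "html.parser")
outer = soup.find("span")  # the first <span> in document order is the outer one

# .strings yields every text node under the tag, nested ones included;
# strip each one and discard those that were only whitespace
parts = [s.strip() for s in outer.strings if s.strip()]
print(parts)

# The last remaining string is the text that follows the inner span
number = parts[-1]
print(number)
```

With the blanks removed, the position no longer depends on how the HTML is indented, although it still assumes the wanted text comes last inside the outer span.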

CodePudding user response:

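Have you tried this?

data = title.find("span")
# Keep only the text nodes that are direct children of the outer span
number = [s.strip() for s in data.find_all(string=True, recursive=False) if s.strip()]

Or, if the value you want consists of digits:

import re
number = re.findall(r"[0-9]+", data.get_text())

If you want the child nodes of the main span as well, you can index into .contents:

main = data.contents[0]  # first child node; this may be a whitespace-only string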
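
To see what .contents actually holds for markup like the question's, here is a small sketch, again assuming the sample HTML and the built-in "html.parser". It also shows one alternative, reading the node that follows the inner span through next_sibling; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup from the question, written on one line with explicit newlines
html = "<span>\n     <span> mobile </span>\n     ItsMobileNumber\n</span>"
soup = BeautifulSoup(html, "html.parser")
outer = soup.find("span")

# .contents lists the direct children in order:
# a whitespace string, the inner <span> tag, then the trailing text
kinds = [type(node).__name__ for node in outer.contents]
print(kinds)

# The wanted text is the node right after the inner span
inner = outer.find("span")
trailing = inner.next_sibling
number = trailing.strip() if isinstance(trailing, NavigableString) else ""
print(number)
```

The isinstance check guards against pages where the inner span is followed by another tag, or by nothing at all, instead of plain text.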