In Python/BeautifulSoup, get_text() failed

Time:07-16

In Python/BeautifulSoup, the value of title in the code below is:

<span ><!--F#f_7[0]-->4K Photon MONO<!--F/--></span> 

When I call title.get_text() to get the text 4K Photon MONO, it fails. Can anyone help? Thanks!

import requests
from  bs4 import BeautifulSoup 
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.select('div > div:nth-child(2) > div:nth-child(4) > div > span > div > span')
title_text= title.get_text()

CodePudding user response:

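This happens because select() returns a list of elements (a ResultSet), not a single element, so the list itself has no get_text() method. To fix it, call get_text() on each element: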

import requests
from  bs4 import BeautifulSoup 
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.select('div > div:nth-child(2) > div:nth-child(4) > div > span > div > span')
text = ''.join(t.get_text() for t in title)
print(text)
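
To see the difference without fetching the live page, here is a minimal offline sketch. It assumes the markup from the question wrapped in a div; the wrapper, the 'div > span' selector and the variable names are made up for illustration, and 'html.parser' is used only so the snippet needs no extra parser installed.

```python
from bs4 import BeautifulSoup

# Markup from the question, wrapped in a hypothetical <div> so it runs offline.
html = '<div><span><!--F#f_7[0]-->4K Photon MONO<!--F/--></span></div>'
soup = BeautifulSoup(html, 'html.parser')

# select() always returns a ResultSet (a list subclass), even for one match.
matches = soup.select('div > span')

# Calling get_text() on the list itself raises AttributeError.
try:
    matches.get_text()
    failed = False
except AttributeError:
    failed = True

# Calling it on an element works; HTML comments are skipped by default.
first_text = matches[0].get_text()
joined_text = ''.join(tag.get_text() for tag in matches)

print(failed, first_text, joined_text)
```

Indexing with matches[0] is enough when exactly one match is expected; the join is only needed when the text is spread across several matched elements.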

CodePudding user response:

This can also be done with soup.find(), which returns a single element, or None if nothing matches.

import requests
from  bs4 import BeautifulSoup 
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.find("span", {"itemprop" : "model"})
title_text= "" if title is None else title.get_text()

CodePudding user response:

The main issue is that select() returns a ResultSet, and you cannot call get_text() or use .text on it until you iterate over it and call the method on each element. Another issue is your selector: it could be more specific.

So how to fix?

Instead of select(), use select_one(), which returns a single element so you can call get_text() on it directly:

soup.select_one('[itemprop="model"]')

Be aware that you should always check that the element you are trying to select is available:

title = e.get_text() if (e:= soup.select_one('[itemprop="model"]')) else None

Note: the walrus operator (:=) requires Python 3.8 or higher.

Alternative for Python < 3.8:

title = soup.select_one('[itemprop="model"]').get_text() if soup.select_one('[itemprop="model"]') else None
Example
import requests
from  bs4 import BeautifulSoup 
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')  # name the parser explicitly to avoid a warning
title = e.get_text() if (e:= soup.select_one('[itemprop="model"]')) else None
print(title)
Output
4K Photon MONO
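
If the same check is needed for several fields, it could be wrapped in a small helper. This is only a sketch: select_text is a hypothetical name, not part of BeautifulSoup, and the sample markup below is made up so the snippet runs without a network request.

```python
from bs4 import BeautifulSoup


def select_text(soup, selector, default=None):
    """Return the stripped text of the first match, or default if none."""
    element = soup.select_one(selector)  # a single Tag, or None
    if element is None:
        return default
    return element.get_text(strip=True)


# Hypothetical sample markup standing in for the real page.
sample = '<div><span itemprop="model"> 4K Photon MONO </span></div>'
sample_soup = BeautifulSoup(sample, 'html.parser')

model = select_text(sample_soup, '[itemprop="model"]')
missing = select_text(sample_soup, '[itemprop="brand"]', default='')

print(repr(model), repr(missing))
```

This works on any Python 3 version, since it does not rely on the walrus operator, and it runs the selector only once per lookup.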