In Python with BeautifulSoup, the code below selects a title element whose value is
<span ><!--F#f_7[0]-->4K Photon MONO<!--F/--></span>
When I call title.get_text() to get the text 4K Photon MONO, it fails. Can anyone help? Thanks!
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.select('div > div:nth-child(2) > div:nth-child(4) > div > span > div > span')
title_text= title.get_text()
CodePudding user response:
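This happens because select() returns a list of matching elements (a ResultSet), not a single element, so the list itself has no get_text() method. To fix it, call get_text() on each element: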
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.select('div > div:nth-child(2) > div:nth-child(4) > div > span > div > span')
text = ''.join(t.get_text() for t in title)
print(text)
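To see the difference without hitting the network, here is a minimal sketch that parses an inline HTML string. The markup is an assumption modelled on the span from the question, not the real eBay page, and html.parser is used only so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Assumed stand-in markup based on the span shown in the question.
html = '<div><span><!--F#f_7[0]-->4K Photon MONO<!--F/--></span></div>'
soup = BeautifulSoup(html, "html.parser")

# select() always returns a list-like ResultSet, even for a single match.
matches = soup.select("div > span")
print(type(matches).__name__, len(matches))

# Calling get_text() on the ResultSet itself raises AttributeError.
raised = False
try:
    matches.get_text()
except AttributeError as exc:
    raised = True
    print("AttributeError:", exc)

# Calling get_text() on each element works; the HTML comments are skipped.
texts = [tag.get_text() for tag in matches]
print(texts)
```

The HTML comments inside the span are not part of the returned text, so they are not the cause of the failure.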
CodePudding user response:
It can also be done using the soup.find() function, which returns the first matching element or None:
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
title=soup.find("span", {"itemprop" : "model"})
title_text= "" if title is None else title.get_text()
CodePudding user response:
The main issue is that select() returns a ResultSet, and you cannot call get_text() or .text on it until you iterate over it and call the method on each element. Another issue is your selector: it could be more specific.
So how to fix it?
Instead of select(), use select_one(), which returns the first matching element (or None), so you can call get_text() directly:
soup.select_one('[itemprop="model"]')
Be aware that you should always check that the element you try to select is actually available:
title = e.get_text() if (e:= soup.select_one('[itemprop="model"]')) else None
Note: the walrus operator requires Python 3.8 or higher.
Alternative for Python < 3.8:
title = soup.select_one('[itemprop="model"]').get_text() if soup.select_one('[itemprop="model"]') else None
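The one-liner above runs the same query twice. A possible alternative that works on older Python versions is a small helper that queries once and falls back to a default. The name select_text and the inline HTML are hypothetical, made up for this sketch.

```python
from bs4 import BeautifulSoup


def select_text(soup, css, default=None):
    """Return the text of the first match for css, or default if none."""
    element = soup.select_one(css)  # query the tree only once
    if element is None:
        return default
    return element.get_text(strip=True)


# Assumed stand-in markup, not the real page.
html = '<div><span itemprop="model"><!--F#f_7[0]-->4K Photon MONO<!--F/--></span></div>'
soup = BeautifulSoup(html, "html.parser")

found = select_text(soup, '[itemprop="model"]')
missing = select_text(soup, '[itemprop="brand"]', default="")
print(found)
print(repr(missing))
```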
Example
import requests
from bs4 import BeautifulSoup
url='https://www.ebay.com/itm/284163810059'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')  # name the parser explicitly to avoid a bs4 warning
title = e.get_text() if (e:= soup.select_one('[itemprop="model"]')) else None
print(title)
Output
4K Photon MONO
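On the point that the selection could be more specific, here is a rough sketch of why a positional selector is fragile. Both HTML snippets are invented for illustration: the second one only adds a banner div, which is enough to break an nth-child chain while the attribute selector still matches.

```python
from bs4 import BeautifulSoup

# Invented markup: the model span sits in the second inner div.
ORIGINAL = """
<div>
  <div>header</div>
  <div><span itemprop="model">4K Photon MONO</span></div>
</div>
"""
# Same content, but an extra banner div shifts every position by one.
SHIFTED = """
<div>
  <div>banner</div>
  <div>header</div>
  <div><span itemprop="model">4K Photon MONO</span></div>
</div>
"""

POSITIONAL = "div > div:nth-child(2) > span"  # depends on page layout
BY_ATTRIBUTE = '[itemprop="model"]'           # depends on the attribute only


def first_text(html, css):
    """Text of the first element matching css, or None if nothing matches."""
    element = BeautifulSoup(html, "html.parser").select_one(css)
    return None if element is None else element.get_text(strip=True)


for name, html in (("original", ORIGINAL), ("shifted", SHIFTED)):
    print(name, first_text(html, POSITIONAL), first_text(html, BY_ATTRIBUTE))
```

Whether itemprop="model" stays stable on the real page is not guaranteed either, which is one more reason to keep the None check.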