I want to select all values that belong to a specific tag using select_one, but it returns only the first value even though the tag is repeated more than once on the page. Shouldn't it select the values under every occurrence of the tag?
For example:
import requests
from bs4 import BeautifulSoup
url = 'https://findthatlocation.com/film-title/a-knights-tale'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
street = list(soup.select_one("div[style='color: #999; font-size: 12px; margin-bottom: 5px;']").stripped_strings)
# or
st = soup.find('div', {'style':'color: #999; font-size: 12px; margin-bottom: 5px;'})
This only returns:
# street
['Prague,']
# st.text.strip()
Prague,
However, the tag appears more than once on the webpage, so I was expecting something like this:
# expected value of street
['Prague,', 'Prague Castle, Prague']
CodePudding user response:
Use .select, not .select_one:
import requests
from bs4 import BeautifulSoup
url = "https://findthatlocation.com/film-title/a-knights-tale"
r = requests.get(url)
soup = BeautifulSoup(r.content, "lxml")
out = [d.get_text(strip=True) for d in soup.select("h3 div")]
print(out)
Prints:
['Prague,', 'Prague Castle, Prague']
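To see the difference without hitting the network, here is a minimal sketch on a small inline HTML snippet. The snippet is a made-up stand-in for the real page (an assumption, not its actual markup), and it uses the built-in "html.parser" so lxml is not required. select_one returns only the first matching element, while select returns a list of all matches:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page (assumed structure)
html = """
<h3>Location one <div style="color: #999;">Prague,</div></h3>
<h3>Location two <div style="color: #999;">Prague Castle, Prague</div></h3>
"""
soup = BeautifulSoup(html, "html.parser")

# select_one: first match only (a single Tag, or None if nothing matches)
first = soup.select_one("h3 div")
first_text = first.get_text(strip=True)

# select: every match, as a list of Tags
all_texts = [d.get_text(strip=True) for d in soup.select("h3 div")]

print(first_text)
print(all_texts)
```

So select_one behaves like taking the first item of what select would return.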
CodePudding user response:
code:
import requests
from bs4 import BeautifulSoup
url = 'https://findthatlocation.com/film-title/a-knights-tale'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
# use select instead of select_one
st = soup.select("div[style='color: #999; font-size: 12px; margin-bottom: 5px;']")
box = []
for i in st:
    box.append(i.text.strip())
print(box)
Returns:
['Prague,', 'Prague Castle, Prague']
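The same applies to the find call in the question: find stops at the first match, and find_all is its "all matches" counterpart. A minimal sketch, again on made-up inline HTML (an assumed structure, not the real page) and with the built-in "html.parser":

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page (assumed structure)
html = """
<div style="color: #999; font-size: 12px; margin-bottom: 5px;">Prague,</div>
<div style="color: #999; font-size: 12px; margin-bottom: 5px;">Prague Castle, Prague</div>
<div style="color: #000;">Unrelated</div>
"""
soup = BeautifulSoup(html, "html.parser")
style = "color: #999; font-size: 12px; margin-bottom: 5px;"

# find: first matching Tag only
first = soup.find("div", {"style": style})

# find_all: list of every matching Tag
texts = [m.get_text(strip=True) for m in soup.find_all("div", {"style": style})]

print(first.get_text(strip=True))
print(texts)
```

Note that matching on the full style string is brittle: any change in spacing or property order on the page would make the lookup miss.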