I need to find the href for a page that has a title I have been given. For example, I have the title Italy, and on Wikipedia I want to get the href of the page that has this title. This is my code:
if status_code_finish == 200:
    list_of_titles_finish = [title.get('title') for title in
                             soup_finish.find(f'title*="{finish}"]')]
The argument finish is Italy.
How can I do something like this:
title.get('title') for title in soup.finish.find(f'title= {finish}')
CodePudding user response:
You could use CSS selectors to select your elements more specifically and in a single statement, without chaining several find() or find_all() calls. Simply use the attribute selector with *= for "contains":
pattern = 'Italy'
[a.get('title') for a in soup.select(f'a[title*="{pattern}"]')]
or with a list of patterns:
pattern = ['Italy','Finland']
set(a.get('title') for p in pattern for a in soup.select(f'a[title*="{p}"]'))
Example
from bs4 import BeautifulSoup
html = '''
<a href="/wiki/Italy" title="Italy">Italy</a>
<a href="/wiki/Italy" title="Italy Finland">Italy Finland</a>
<a href="/wiki/Finland" title="Finland">Finland</a>
'''
# name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, 'html.parser')
pattern = 'Italy'
print([a.get('title') for a in soup.select(f'a[title*="{pattern}"]')])
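Since the question asks for the href rather than the title, the same kind of selector can be pointed at the href attribute instead. The following is only a minimal sketch, assuming an exact title match is wanted (= rather than *=) and that relative Wikipedia-style links should be joined against a base URL. The helper name href_for_title and the default base URL are illustrative assumptions, not part of the original answer:

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

html = '''
<a href="/wiki/Italy" title="Italy">Italy</a>
<a href="/wiki/Italy_Finland" title="Italy Finland">Italy Finland</a>
<a href="/wiki/Finland" title="Finland">Finland</a>
'''


def href_for_title(soup, title, base_url='https://en.wikipedia.org'):
    """Hypothetical helper: absolute href of the first <a> whose title equals `title`."""
    # [title="..."] matches exactly; [title*="..."] would also match "Italy Finland".
    # [href] skips anchors that have a title but no link target.
    link = soup.select_one(f'a[title="{title}"][href]')
    if link is None:
        return None
    # Turn a relative link such as /wiki/Italy into an absolute URL.
    return urljoin(base_url, link['href'])


soup = BeautifulSoup(html, 'html.parser')
print(href_for_title(soup, 'Italy'))
```

Note that a title containing a double quote would need escaping before being placed in the selector string; this sketch does not handle that case.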
CodePudding user response:
You can use soup.find_all() with the constraint title=finish to get a list of the elements whose title attribute equals finish. After that, you can just iterate through the list.
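A short sketch of that approach, assuming the links of interest are a tags and using a small inline HTML snippet in place of the real page (the snippet and the variable names are illustrative):

```python
from bs4 import BeautifulSoup

html = '''
<a href="/wiki/Italy" title="Italy">Italy</a>
<a href="/wiki/Finland" title="Finland">Finland</a>
'''
soup = BeautifulSoup(html, 'html.parser')
finish = 'Italy'

# Keyword arguments to find_all() filter on attributes;
# a plain string value must match the attribute exactly.
links = soup.find_all('a', title=finish)

# Iterate through the matches and collect their href values.
hrefs = [link.get('href') for link in links]
print(hrefs)
```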
CodePudding user response:
To find the title of a webpage using the soup.select method in Python, you can use the following code:
from bs4 import BeautifulSoup
import requests
url = 'https://www.example.com'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
title = soup.select('title')[0].get_text()
print(title)
This code first imports the necessary modules, then sends a GET request to the specified URL using the requests module. The response is parsed with BeautifulSoup, and the title tag is selected using the soup.select method. The [0] selects the first element in the list returned by soup.select, and get_text() extracts the text inside the title tag.
Note that the above code assumes there is only one title tag in the HTML document. If there is more than one, you need to loop through them, or narrow the selection by adding a class, id, or other attribute to the selector passed to select.
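As a rough illustration of that note, the sketch below uses an inline HTML string (an assumption, so it runs without a network request) that contains a second title tag inside an svg element. It narrows the selector to the title in head and, separately, loops over every title tag:

```python
from bs4 import BeautifulSoup

html = (
    '<html><head><title>Example Domain</title></head>'
    '<body><svg><title>icon</title></svg></body></html>'
)
soup = BeautifulSoup(html, 'html.parser')

# select_one() returns the first match or None, so a missing tag
# does not raise IndexError the way select(...)[0] would.
first = soup.select_one('head > title')
page_title = first.get_text(strip=True) if first is not None else None

# Loop over every title tag when more than one may be present.
all_titles = [t.get_text(strip=True) for t in soup.select('title')]

print(page_title)
print(all_titles)
```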