How to extract class name as string from first element only?

I'm new to Python and have been using this piece of code to get the class name as text for my CSV, but I can't make it extract only the first one. Do you have any idea how to do that?

    for x in book_url_soup.findAll('p', class_="star-rating"):
        for k, v in x.attrs.items():
            review = v[1]
            reviews.append(review)
            del reviews[1]
            print(review)

The URL is: http://books.toscrape.com/catalogue/its-only-the-himalayas_981/index.html

The output is:

Two
Two
One
One
Three
Five
Five

I only need the first output and don't know how to stop the code from also picking up the "star ratings" further down the page, which share the same class name.

CodePudding user response:

Instead of find_all(), which returns a ResultSet of every match, you could use find() or select_one() to select only the first occurrence of the element, then pick the last entry from its list of class names:

soup.find('p', class_='star-rating').get('class')[-1]

or with a CSS selector:

soup.select_one('p.star-rating').get('class')[-1]
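
To see why [-1] yields the rating word, here is a small offline sketch. The HTML string is a hypothetical, trimmed-down imitation of the page (the real markup may differ); it only assumes that the book's own rating appears before the ratings of other books.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the page: the book's own rating comes
# first, ratings of other books follow further down.
html = """
<div class="product_main">
  <p class="star-rating Two"></p>
</div>
<ul>
  <li><p class="star-rating One"></p></li>
  <li><p class="star-rating Five"></p></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# find() stops at the first match in document order
first = soup.find("p", class_="star-rating")

# "class" is a multi-valued attribute, so bs4 returns it as a list
classes = first.get("class")
rating = classes[-1]

# find_all() would instead collect every rating on the page
all_ratings = [p["class"][-1] for p in soup.find_all("p", class_="star-rating")]

print(classes)
print(rating)
print(all_ratings)
```

Because the class list keeps the order from the markup, the last entry is the rating word as long as the page writes it after star-rating.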

In new code, also avoid the old findAll() syntax and use find_all() instead. For more, take a minute to check the docs.

Example

from bs4 import BeautifulSoup
import requests

url = 'http://books.toscrape.com/catalogue/its-only-the-himalayas_981/index.html'
page = requests.get(url).text
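soup = BeautifulSoup(page, 'html.parser')  # name the parser explicitly to avoid a warning

# First matching <p>; its class list ends with the rating word
print(soup.find('p', class_='star-rating').get('class')[-1])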

Output

Two
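
Since the value is meant for a CSV, it may help to guard against pages without a rating and to turn the word into a number. The following is only a sketch: the helper first_star_rating and the WORD_TO_STARS mapping are hypothetical names introduced here, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical mapping from the class word to a number of stars
WORD_TO_STARS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def first_star_rating(soup):
    """Return the first star rating in the soup as an int, or None if absent."""
    tag = soup.select_one("p.star-rating")
    if tag is None:
        # select_one() returns None when nothing matches
        return None
    word = tag["class"][-1]
    # Unknown words also map to None instead of raising
    return WORD_TO_STARS.get(word)


with_rating = BeautifulSoup('<p class="star-rating Two"></p>', "html.parser")
without_rating = BeautifulSoup("<div></div>", "html.parser")

print(first_star_rating(with_rating))
print(first_star_rating(without_rating))
```

Returning None rather than raising lets the scraping loop write an empty cell and carry on with the next book.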