Extract Specific Text in BS

Time:07-07

I can't scrape the text that comes after the "Product Description" heading on this page:

http://books.toscrape.com/catalogue/1000-places-to-see-before-you-die_1/index.html

This is my code so far:

import requests
from bs4 import BeautifulSoup

book_url = 'http://books.toscrape.com/catalogue/1000-places-to-see-before-you-die_1/index.html'
response = requests.get(book_url)
soup = BeautifulSoup(response.content, 'lxml')
book_body = soup.find('article', class_='product_page')

Should I extract all the "p" tags before the text?

CodePudding user response:

HTML IDs are unique (or at least should be), so prefer them when scraping if they are available. In your case, the "Product Description" heading sits inside the element with the id product_description:

<div id="product_description">
    <h2>Product Description</h2>
</div>

So, to find the element with id="product_description", use:

import requests
from bs4 import BeautifulSoup


book_url = 'http://books.toscrape.com/catalogue/1000-places-to-see-before-you-die_1/index.html'
response = requests.get(book_url)
soup = BeautifulSoup(response.content, 'lxml')
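# The div with this id only wraps the <h2> heading; the description
# text itself is in the <p> element that follows the div.
book_body = soup.find(id='product_description')
description = book_body.find_next_sibling('p')
print(description.get_text(strip=True))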
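
Note that find returns None when nothing matches, so a page without a description section would raise an AttributeError. Below is a minimal defensive sketch that runs offline. It assumes the description is the <p> that follows the div shown above. The SAMPLE_HTML markup, its placeholder text, and the extract_description helper are hypothetical names made up for illustration, not part of the original code or the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup modelled on the snippet above;
# the paragraph text is a placeholder, not the real page content.
SAMPLE_HTML = """
<article class="product_page">
  <p class="price_color">Price placeholder</p>
  <div id="product_description">
    <h2>Product Description</h2>
  </div>
  <p>Placeholder description text.</p>
</article>
"""


def extract_description(html):
    """Return the text of the <p> following #product_description, or None."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find(id="product_description")
    if heading is None:
        return None  # the page has no description section
    paragraph = heading.find_next_sibling("p")
    if paragraph is None:
        return None  # heading present, but no paragraph after it
    return paragraph.get_text(strip=True)


print(extract_description(SAMPLE_HTML))
```

To use it on the live page, you could pass response.content instead of SAMPLE_HTML. The built-in html.parser is used here only so the sketch does not depend on lxml.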