How to extract data from the body for web scraping after requests.get()

Time:08-24

Here is my code:


    import requests

    SUSTAINABILITY = []

    response = requests.get(URL, timeout=15)
    page_src = response.text
    SUSTAINABILITY.append(page_src.count("sustainability"))

I get the HTML from response.text and then count how many times the word "sustainability" appears in it. This code works, but it counts over the whole page, and I only want to count the word "sustainability" inside the body tag.

How can I extract the contents of the body tag and then call count() to see how many times the word "sustainability" occurs?

CodePudding user response:

@andrej-kesely gave good advice.

    import requests
    from bs4 import BeautifulSoup

    response = requests.get(URL, timeout=15)
    # Make a "soup" from the response's text
    soup = BeautifulSoup(response.text, 'html.parser')
    # soup.body is the <body> element as a Tag (None if the page has no <body>).
    # str() serializes it with its markup, so this count also covers
    # tag names and attribute values, not only the visible text.
    print(str(soup.body).count("sustainability"))
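If you only want matches in the text of the body and not in tag attributes or scripts, something like the following might work. This is a minimal sketch, not a definitive solution: the helper name count_word_in_body and the sample_html string are made up for illustration, and in real use the HTML would come from response.text. It also matches case-insensitively, which is an assumption about what you want.

```python
import re

from bs4 import BeautifulSoup


def count_word_in_body(html: str, word: str) -> int:
    """Count case-insensitive occurrences of `word` in the text of <body>."""
    soup = BeautifulSoup(html, "html.parser")
    body = soup.body
    if body is None:  # the page has no <body> element
        return 0
    # Remove <script> and <style> elements so their contents are never counted
    # (depending on the bs4 version, get_text() may or may not include them).
    for tag in body(["script", "style"]):
        tag.decompose()
    text = body.get_text(separator=" ")
    # Substring match, like str.count(), but ignoring case.
    return len(re.findall(re.escape(word), text, flags=re.IGNORECASE))


# Hypothetical sample page; replace with response.text in real use.
sample_html = """
<html>
  <head><title>Sustainability report</title></head>
  <body class="sustainability">
    <h1>Sustainability</h1>
    <p>Our sustainability goals for the year.</p>
    <script>var sustainability = 1;</script>
  </body>
</html>
"""

print(count_word_in_body(sample_html, "sustainability"))  # prints 2
```

Here the title (in the head), the class attribute, and the script are all ignored, so only the heading and the paragraph are counted. Like str.count(), this counts substrings, so a longer word that contains "sustainability" would also match.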
