How to scrape last string of <p> tag element?

Time:01-13

To start, Python is the first language I am learning. I am scraping a website for rent prices across my city and using BeautifulSoup to get the price data, but I am unable to get the value of this <p> tag.

Here is the tag:

<p><strong >Monthly Rent: </strong>2,450  </p>

Here is my code:

text = soup.find_all("div", {"class": "plan-group rent"})
for item in text:
    rent = item.find_all("p")
    for price in rent:
        print(price.string)

I also tried:

text = soup.find_all("div", {"class": "plan-group rent"})
for item in text:
    rent = item.find_all("p")
    for price in rent:
        items = price.find_all("strong")
        for item in items:
            print(item.string)

That prints "Monthly Rent:", but I don't understand why I can't get the actual price. The code above shows me that the "Monthly Rent:" label is in the <strong> tag, which I took to mean that the <p> tag only contains the price, which is what I want.

CodePudding user response:

As mentioned by @kyrony, there are two children in your <p>. Because you select the <strong>, you only get one of the texts.

You could use different approaches, for example stripped_strings:

list(soup.p.stripped_strings)[-1]

or contents

soup.p.contents[-1]

or find() with recursive=False (string= is the current name of the older text= argument)

soup.p.find(string=True, recursive=False)

Example

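from bs4 import BeautifulSoup

html = '''<p><strong >Monthly Rent: </strong>2,450  </p>'''
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")

# The last child of <p> is the text node; strip() drops the trailing spaces
print(soup.p.contents[-1].strip())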
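
Applied to the loop from the question, the same idea might look like the sketch below. The sample page and the helper name extract_rents are assumptions made for illustration; only the div class and the <p> markup come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: only the class name and the <p> markup
# are taken from the question, the second listing is made up.
html = '''
<div class="plan-group rent">
  <p><strong>Monthly Rent: </strong>2,450  </p>
</div>
<div class="plan-group rent">
  <p><strong>Monthly Rent: </strong>1,995  </p>
</div>
'''

def extract_rents(markup):
    """Return the last text fragment of each <p> inside the rent blocks."""
    soup = BeautifulSoup(markup, "html.parser")
    rents = []
    for block in soup.find_all("div", class_="plan-group rent"):
        for p in block.find_all("p"):
            # stripped_strings yields every text fragment under <p>,
            # already stripped; the price is the last one.
            strings = list(p.stripped_strings)
            if strings:
                rents.append(strings[-1])
    return rents

print(extract_rents(html))
```

Taking the last item of stripped_strings avoids depending on how many tags come before the price, and the empty-list check skips any <p> that has no text at all.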

CodePudding user response:

Technically your content has two children

<p><strong >Monthly Rent: </strong>2,450  </p>

A strong tag

<strong >Monthly Rent: </strong>

and a string

2,450  

The .string attribute in Beautiful Soup only returns a value when the tag has exactly one child. Your <p> has two, so it returns None. To get the second string, use the stripped_strings generator.
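
A short sketch of that behaviour, using the tag from the question (the variable names are only illustrative):

```python
from bs4 import BeautifulSoup

html = '<p><strong>Monthly Rent: </strong>2,450  </p>'
p = BeautifulSoup(html, "html.parser").p

# <p> has two children: the <strong> tag and a bare text node.
child_count = len(p.contents)

# With more than one child, .string cannot pick one, so it is None.
p_string = p.string

# <strong> has a single text child, so .string works there.
label = p.strong.string

# stripped_strings walks every text fragment under <p>, whitespace removed.
fragments = list(p.stripped_strings)

print(child_count, p_string, repr(label), fragments)
```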
