BeautifulSoup has stopped returning the tag when I use find(text="example")

I have a scraper that I've been using for about a year without issue. It finds a specific element by searching for its text. Until now, find(text="Cost of gas per GJ") returned the whole tag; now it returns only the text.

import requests
from bs4 import BeautifulSoup

rate_2_response = requests.get("https://www.fortisbc.com/accounts-billing/billing-rates/natural-gas-rates/business-rates#tab-0")
# print(rate_2_response.text)

rate_2_soup = BeautifulSoup(rate_2_response.text, "html.parser")
rate_2_table = rate_2_soup.find("body")
print(rate_2_table)
cost_of_gas_gj = rate_2_table.find(text="Cost of gas per GJ")  # problematic line: returns only the text
print(cost_of_gas_gj)

Printing rate_2_table shows a long list of elements, including the one I need, so that part works. The find() call is what seems to go wrong.

I need cost_of_gas_gj to be <td width="70%">Cost of gas per GJ</td>, not just the inner text. The website has not changed.

CodePudding user response:

Try this:

import requests
from bs4 import BeautifulSoup

url = "https://www.fortisbc.com/accounts-billing/billing-rates/natural-gas-rates/business-rates#tab-0"
rate_2_soup = BeautifulSoup(requests.get(url).text, "html.parser")
cost_of_gas_gj = rate_2_soup.find("td", text="Cost of gas per GJ")
print(cost_of_gas_gj)

Output:

<td width="70%">Cost of gas per GJ</td>
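
The difference between the two calls can be reproduced offline. The snippet below is a minimal sketch, assuming a made-up one-row table in place of the real page (the second cell is a placeholder, not actual rate data). It uses string=, which is the name Beautiful Soup has used for the text= argument since version 4.4.0. Without a tag name, find() returns the matching text node; with a tag name, it returns the tag whose string matches.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Assumption: a tiny stand-in for the real page; the second cell is a placeholder.
html = '<table><tr><td width="70%">Cost of gas per GJ</td><td>placeholder</td></tr></table>'
soup = BeautifulSoup(html, "html.parser")

# No tag name given: the result is the text node itself.
text_only = soup.find(string="Cost of gas per GJ")
print(type(text_only).__name__)  # NavigableString

# Tag name given: the result is the <td> tag whose string matches.
cell = soup.find("td", string="Cost of gas per GJ")
print(cell)  # <td width="70%">Cost of gas per GJ</td>
```

Note that the match is exact, so a cell with extra whitespace around the label would not be found this way.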

CodePudding user response:

If you don't know which tag contains the specified text, use .parent:

...

cost_of_gas_gj = rate_2_table.find(text="Cost of gas per GJ").parent

# Output: <td width="70%">Cost of gas per GJ</td>
print(cost_of_gas_gj)

...
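
One caveat with chaining .parent: find() returns None when nothing matches, so the chained call raises AttributeError if the label ever disappears from the page. A small guard avoids that. This is only a sketch; find_tag_by_text is a hypothetical helper name, and the HTML is a made-up stand-in for the real page.

```python
from typing import Optional

from bs4 import BeautifulSoup
from bs4.element import Tag


def find_tag_by_text(soup: BeautifulSoup, label: str) -> Optional[Tag]:
    """Return the tag that directly contains `label`, or None if it is absent.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    node = soup.find(string=label)  # text node (NavigableString) or None
    return node.parent if node is not None else None


# Assumption: a made-up fragment standing in for the scraped page.
sample_html = '<table><tr><td width="70%">Cost of gas per GJ</td></tr></table>'
sample_soup = BeautifulSoup(sample_html, "html.parser")

found = find_tag_by_text(sample_soup, "Cost of gas per GJ")
missing = find_tag_by_text(sample_soup, "No such label")

print(found)    # <td width="70%">Cost of gas per GJ</td>
print(missing)  # None
```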