This is the HTML:
<div aria-label="RM 6,000 a month" class="salary-snippet"><span>RM 6,000 a month</span></div>
I used this code:
divs = soup.find_all('div', class_='job_seen_beacon')
for item in divs:
    print(item.find('div', class_='salary-snippet'))
This prints results such as:
<div aria-label="RM 3,500 to RM 8,000 a month" class="salary-snippet"><span>RM 3,500 - RM 8,000 a month</span></div>
If I use
print(item.find('div', class_='salary-snippet').text.strip())
it raises the error
AttributeError: 'NoneType' object has no attribute 'text'
So how can I get only the span text? It's my first time web scraping.
CodePudding user response:
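I believe the line can also be written with an attribute dictionary (this is equivalent to class_='salary-snippet', so on its own it will not fix the error):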
print(item.find('div', {'class':'salary-snippet'}).text.strip())
Alternatively, if the div contains only the span, you can simply use:
item.find("span").text.strip()
Since you used .find_all(), make sure that every div returned by soup.find_all('div', class_='job_seen_beacon') contains the element you are looking for. The error can arise if even one of them doesn't, because .find() returns None when nothing matches.
For example:
divs = soup.find_all('div', class_='job_seen_beacon')
for item in divs:
    try:
        print(item.find('div', {'class':'salary-snippet'}).text.strip())
    except AttributeError:
        print("Item Not available")
This tries to get the text; if that fails, it prints "Item Not available" instead of crashing, so you can identify which items failed and why. Perhaps they don't have the element you are searching for.
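As an alternative to try/except, the missing element can be checked explicitly, since find() returns None when nothing matches. Below is a minimal sketch, assuming a made-up two-card HTML sample; the sample markup and the helper name extract_salaries are illustrative and not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the second job card has no salary snippet.
html_doc = """
<div class="job_seen_beacon">
  <div aria-label="RM 6,000 a month" class="salary-snippet"><span>RM 6,000 a month</span></div>
</div>
<div class="job_seen_beacon">
  <h2>No salary listed</h2>
</div>
"""


def extract_salaries(html):
    """Return one entry per job card: the salary text, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    salaries = []
    for item in soup.find_all("div", class_="job_seen_beacon"):
        snippet = item.find("div", class_="salary-snippet")
        # find() returns None when the card has no matching div,
        # so test for that before asking for the text.
        salaries.append(snippet.get_text(strip=True) if snippet is not None else None)
    return salaries


print(extract_salaries(html_doc))
```

Keeping None in the result preserves one entry per card, so the salaries still line up with the other fields you scrape from the same cards.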
CodePudding user response:
Here is how to get the span text inside the div tag:
from bs4 import BeautifulSoup

html_doc = """
<div aria-label="RM 6,000 a month" class="salary-snippet">
<span>
RM 6,000 a month
</span>
</div>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
# print(soup.prettify())

# For a list of containers
divs = soup.find_all('div', class_='salary-snippet')
for item in divs:
    # strip() removes the newlines around the span text
    print(item.find('span').text.strip())

# For a single container
# span = soup.find('div', class_='salary-snippet').find_next('span').text.strip()
# print(span)
Output:
RM 6,000 a month
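The same lookup could also be expressed as a CSS selector with select(). This is only a sketch of an alternative, reusing the sample HTML from this answer; the selector "div.salary-snippet > span" assumes the span is a direct child of the div, as in the snippet from the question.

```python
from bs4 import BeautifulSoup

html_doc = """
<div aria-label="RM 6,000 a month" class="salary-snippet">
  <span>
    RM 6,000 a month
  </span>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# select() takes a CSS selector; this one matches spans that are
# direct children of a div with class "salary-snippet".
salaries = [
    span.get_text(strip=True)
    for span in soup.select("div.salary-snippet > span")
]
print(salaries)
```

Because select() returns an empty list when nothing matches, this form never hits the 'NoneType' error; containers without a salary are simply skipped.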