The output contains HTML tags, but I don't need them.
Trying to get only the text throws AttributeError:
'NoneType' object has no attribute 'get_text'
import requests
from bs4 import BeautifulSoup

url = requests.get("https://in.indeed.com/jobs?q=python developer&l=")
soup = BeautifulSoup(url.content, "html.parser")
parsed_file = soup.find(id="resultsBody")
items = parsed_file.find_all(class_="slider_container")
for item in items:
    job_title = item.find(title='Python Developer').get_text()
    print(job_title)
CodePudding user response:
Since you only want to print the jobs whose title is Python Developer, you first need to check that a job with that title exists, i.e. that .find() did not return None. Just put this check inside your for loop:
job_title = item.find(title='Python Developer')
# If job_title is not None, print the text
if job_title:
    print(job_title.get_text())
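As a rough sketch of the same check that runs without requesting the site, the snippet below parses a small hand-written HTML string. The markup is made up for illustration and is not Indeed's real page structure; only the class name slider_container and the title value are taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the job cards (assumption, not the real page).
html = """
<div id="resultsBody">
  <div class="slider_container"><span title="Python Developer">Python Developer</span></div>
  <div class="slider_container"><span title="Data Engineer">Data Engineer</span></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
titles = []
for item in soup.find_all(class_="slider_container"):
    job_title = item.find(title="Python Developer")
    # find() returns None when nothing matches, so check before calling get_text()
    if job_title is not None:
        titles.append(job_title.get_text(strip=True))

print(titles)
```

The second card has no element with that title, so find() returns None for it and the card is skipped instead of raising AttributeError.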
CodePudding user response:
.get_text() only works if your selection for a title returns a result. To fix this, first check that the result is not None:
for item in items:
    job_title = item.find(title='Python Developer').get_text() if item.find(title='Python Developer') else 'no result'
    print(job_title)
Hint
Your selection could be more focused, so that you can loop over the cards more efficiently and also scrape additional info:
soup.select('#mosaic-provider-jobcards > a')
Example
import requests
from bs4 import BeautifulSoup

url = requests.get("https://in.indeed.com/jobs?q=python developer&l=")
soup = BeautifulSoup(url.content, "html.parser")
data = []
for item in soup.select('#mosaic-provider-jobcards > a'):
    if item.find(title='Python Developer'):
        data.append({
            'title': item.h2.get_text(),
            'company': item.a.get_text(),
            '...': '...'
        })
data
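Note that item.h2 and item.a are themselves None when a card lacks that tag, so the same AttributeError can come back. One possible way to guard each field is sketched below against a hand-written HTML string. The markup, the companyName class, and the text_or helper are all assumptions introduced for illustration, not the real page structure or part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (assumption); the second card has no company element.
html = """
<div id="mosaic-provider-jobcards">
  <a href="/job/1"><h2><span title="Python Developer">Python Developer</span></h2><span class="companyName">Acme</span></a>
  <a href="/job/2"><h2><span title="Python Developer">Python Developer</span></h2></a>
  <a href="/job/3"><h2><span title="Data Engineer">Data Engineer</span></h2><span class="companyName">Globex</span></a>
</div>
"""

def text_or(tag, default="no result"):
    """Hypothetical helper: return the tag's text, or a default if the tag is None."""
    return tag.get_text(strip=True) if tag is not None else default

soup = BeautifulSoup(html, "html.parser")
data = []
for item in soup.select("#mosaic-provider-jobcards > a"):
    # Keep only cards that contain an element titled "Python Developer"
    if item.find(title="Python Developer"):
        data.append({
            "title": text_or(item.h2),
            "company": text_or(item.find(class_="companyName")),
        })

print(data)
```

With this shape, a missing field ends up as 'no result' in the dictionary rather than stopping the loop.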