Using a for loop with bs4.element.Tag in Python

Time:10-13

I'm learning the basics of web-scraping and am using Indeed as my testing ground.

I'm excluding sections of my code that I'm happy with to avoid a lengthy post. The "indeed(dot)com" portion of the print statement will be substituted with "site" so my post does not get auto-removed or flagged.

The related_jobs variable is of type bs4.element.Tag. My code is as follows:

for job in jobs:
    related_jobs = job.find('span', class_='mat')
    print(f"All Postings By {company_name}: site{related_jobs}")

Here is one of the outputs for the print statement: [screenshot: print_statement_screencap]

My issue is as follows: I want to append the href of the first 'a' tag inside the span tag to "site". When I try to implement that this way:

related_jobs = job.find('span', class_='mat').a['href']

the output for the first listing is exactly how I want it, but the loop does not continue past that listing. I receive this error: "AttributeError: 'NoneType' object has no attribute 'a'".

My question: Is there a way to have my for loop continue through every listing on the page? If not, is there a string method that I can use to grab the first 'a' tag?

CodePudding user response:

Some iterations of your loop find no matching span, so job.find() returns None, and accessing .a on None raises the AttributeError. You can either do an if check or use a try/except statement:

for job in jobs:
    related_jobs = job.find('span', class_='mat')
    if related_jobs is None or related_jobs.a is None:
        # no span.mat, or no link inside it - skip this iteration
        continue
    related_jobs = related_jobs.a['href']
    print(f"All Postings By {company_name}: site{related_jobs}")

Or, a more Pythonic approach is to use a try/except statement:

for job in jobs:
    try:
        related_jobs = job.find('span', class_='mat').a['href']
    except (AttributeError, TypeError, KeyError):
        # AttributeError: no span.mat; TypeError: span has no <a>; KeyError: <a> has no href
        continue
    print(f"All Postings By {company_name}: site{related_jobs}")
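
If you want something you can run end to end, here is a minimal sketch of the same skip-when-missing idea. It is not your actual scraper: the sample markup, the div.job container, the related_job_links helper and the https://site.example base URL are all made up for illustration. It uses select_one with a CSS selector, which returns None when the span, the link, or the href is missing, and urljoin to attach the site root to the relative href.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: the second card has no
# span.mat at all, and the third has the span but no link inside it.
SAMPLE_HTML = """
<div class="job"><span class="mat"><a href="/cmp/acme/jobs">more</a></span></div>
<div class="job"><p>No related postings</p></div>
<div class="job"><span class="mat">no link here</span></div>
<div class="job"><span class="mat"><a href="/cmp/globex/jobs">more</a></span></div>
"""

BASE_URL = "https://site.example"  # placeholder for the real site root


def related_job_links(html, base_url=BASE_URL):
    """Return absolute URLs for the first link inside each span.mat."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for job in soup.find_all("div", class_="job"):
        # select_one returns None if no <a> with an href exists inside span.mat
        link = job.select_one("span.mat a[href]")
        if link is None:
            continue  # nothing to report for this card; move to the next one
        links.append(urljoin(base_url, link["href"]))
    return links


print(related_job_links(SAMPLE_HTML))
```

With this sample, the two cards without a usable link are skipped and the loop still reaches the last card. In your own code you would keep your existing jobs loop and only swap in the select_one check.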