BeautifulSoup: AttributeError: 'NoneType' object has no attribute 'text'

Time:11-05

I am getting a 'NoneType' object has no attribute 'text' error while scraping a web page with BeautifulSoup.

The relevant part of the HTML document looks like this:

<div class="ntb boy">
 <ol>...</ol>
 <ul class="nbd">
  <li class="ntr" data-id="bwjleo">
   <i class="nvt">...</i>
   <dl class="nem">
    <dt class="nvar">
     <b>
      <a href="https://www.babynamesdirect.com/boy/aak" title="Meaning and more details of Aak">
       Aak
      </a>
     </b>
    </dt>
    <dd class="ndfn">
     A Nature; Sky
    </dd>
   </dl>
   <em class="narr">
   </em>
  </li>
  <li>...</li>
  <li>...</li>
       .
       .
 </ul>
</div>

The code used to extract the names ("Aak" in the HTML above):

import requests
from bs4 import BeautifulSoup

res = requests.get('https://www.babynamesdirect.com/baby-names/indian/boy/trending')
soup = BeautifulSoup(res.text, 'html5lib')
ul = soup.find('div', class_='ntb boy').find_all('li')
names = [name.dt.text for name in ul]
print(names)

When I print name.dt, I get a bs4.element.Tag, but name.dt.text raises AttributeError: 'NoneType' object has no attribute 'text'.

CodePudding user response:

You are getting the error because some of the <li> elements do not contain a <dt>, so name.dt is None for them and accessing .text on it fails.

For example, after the name Naksh there is an empty entry, which triggers the error.

You can try this to avoid the error:

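import requests
from bs4 import BeautifulSoup

res = requests.get('https://www.babynamesdirect.com/baby-names/indian/boy/trending')
soup = BeautifulSoup(res.text, 'html5lib')
ul = soup.find('div', class_=['ntb', 'boy']).find_all('li')
for name in ul:
    try:
        # name.dt is None for <li> items without a <dt>, so .a raises AttributeError
        print(name.dt.a.text)
    except AttributeError:
        # Skip list items that do not hold a name
        pass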
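
If you would rather not rely on try/except, an explicit None check does the same job. The following is only a minimal sketch: SAMPLE_HTML is a made-up, trimmed copy of the markup from the question (with one empty <li> added to reproduce the problem), the function names extract_names and extract_names_css are hypothetical, and it uses the built-in 'html.parser' so it runs without html5lib or a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical sample, modeled on the markup in the question.
# The second <li> has no <dt>, so `li.dt` would be None for it.
SAMPLE_HTML = """
<div class="ntb boy">
 <ul class="nbd">
  <li class="ntr"><dl class="nem"><dt class="nvar"><b><a href="#">Aak</a></b></dt>
   <dd class="ndfn">A Nature; Sky</dd></dl></li>
  <li></li>
  <li class="ntr"><dl class="nem"><dt class="nvar"><b><a href="#">Naksh</a></b></dt></dl></li>
 </ul>
</div>
"""


def extract_names(html):
    """Collect the text of each <dt>, skipping <li> items that have none."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="ntb boy")
    if container is None:
        # find() returns None when nothing matches, so guard this step too
        return []
    names = []
    for li in container.find_all("li"):
        dt = li.find("dt")
        if dt is None:
            continue  # empty list item: nothing to extract
        names.append(dt.get_text(strip=True))
    return names


def extract_names_css(html):
    """Same idea with a CSS selector: only <a> tags inside a <dt> are matched."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text(strip=True) for a in soup.select("div.ntb.boy li dt a")]


print(extract_names(SAMPLE_HTML))
print(extract_names_css(SAMPLE_HTML))
```

The selector version never sees the empty items at all, so there is nothing to guard against. The explicit check is probably easier to extend if you later want to read the <dd> meaning alongside each name.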