I am getting a 'NoneType' object has no attribute 'text' error while scraping a web page using BeautifulSoup.
The relevant part of the HTML document looks like this:
<div class="ntb boy">
<ol>...</ol>
<ul class="nbd">
<li class="ntr" data-id="bwjleo">
<i class="nvt">...</i>
<dl class="nem">
<dt class="nvar">
<b>
<a href="https://www.babynamesdirect.com/boy/aak" title="Meaning and more details of Aak">
Aak
</a>
</b>
</dt>
<dd class="ndfn">
A Nature; Sky
</dd>
</dl>
<em class="narr">
</em>
</li>
<li>...</li>
<li>...</li>
.
.
</ul>
</div>
The code I use to extract the names ("Aak" in the HTML above):
res = requests.get('https://www.babynamesdirect.com/baby-names/indian/boy/trending')
soup = BeautifulSoup(res.text, 'html5lib')
ul = soup.find('div', class_ = 'ntb boy').find_all('li')
names = [name.dt.text for name in ul]
print(names)
When I try to print name.dt, I get a bs4.element.Tag. But name.dt.text gives AttributeError: 'NoneType' object has no attribute 'text'.
CodePudding user response:
You are getting the error because some of the li elements have no dt inside them, so name.dt is None for those items and accessing .text on it fails.
For example, after the name Naksh there is an empty item, which triggers the error.
You can try this to avoid the error:
import requests
from bs4 import BeautifulSoup

res = requests.get('https://www.babynamesdirect.com/baby-names/indian/boy/trending')
soup = BeautifulSoup(res.text, 'html5lib')
ul = soup.find('div', class_=['ntb', 'boy']).find_all('li')
for name in ul:
    try:
        print(name.dt.a.text)
    except AttributeError:
        # This <li> has no <dt> (or no <a> inside it), so skip it.
        pass
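
As an alternative to try/except, the missing dt can be checked for explicitly. The following is only a minimal sketch: it runs on a small inline HTML sample (SAMPLE_HTML, made up here to mimic the structure from the question, including one empty li) instead of the live page, and it uses the built-in 'html.parser' so that html5lib is not required. The helper name extract_names is hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up sample that mimics the structure from the question.
# The second <li> is empty, like the item that causes the error.
SAMPLE_HTML = """
<div class="ntb boy">
  <ul class="nbd">
    <li class="ntr"><dl class="nem"><dt class="nvar"><b><a href="#">Aak</a></b></dt><dd class="ndfn">A Nature; Sky</dd></dl></li>
    <li></li>
    <li class="ntr"><dl class="nem"><dt class="nvar"><b><a href="#">Naksh</a></b></dt></dl></li>
  </ul>
</div>
"""


def extract_names(html):
    """Return the text of every <dt> found inside the list items."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="ntb boy")
    if container is None:
        # The page does not contain the expected <div>.
        return []
    names = []
    for li in container.find_all("li"):
        dt = li.find("dt")
        if dt is None:
            # Empty list item: nothing to extract, skip it.
            continue
        names.append(dt.get_text(strip=True))
    return names


print(extract_names(SAMPLE_HTML))
```

Checking for None skips only the items that really lack a dt, so any other unexpected error still shows up instead of being silently swallowed.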