I read a book that has the following:
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen('http://www.pythonscraping.com/pages/page3.html')
bs = BeautifulSoup(html, 'html.parser')
for child in bs.find('table',{'id':'giftList'}).children:
    print(child)
This code prints the list of product rows in the giftList table, including the initial row of column labels. If you were to write it using the descendants() function instead of the children() function, about two dozen tags would be found within the table and printed, including img tags, span tags, and individual td tags.
I tested it and did not see any difference between the two outputs when using .children versus .descendants. Can anyone tell me exactly what is printed with .children and what is printed with .descendants?
CodePudding user response:
The difference lies in depth. .children goes only one level deep: it yields the direct children of a tag. .descendants goes all the way down: it yields every nested node at every depth.
Take this excerpt from the "three sisters" document in the Beautiful Soup docs:
<p ><b>The Dormouse's story</b></p>
# p is the <p> tag shown above
for child in p.children:
    print(child)
>>> <b>The Dormouse's story</b>
for child in p.descendants:
    print(child)
>>> <b>The Dormouse's story</b>
>>> The Dormouse's story
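To see the same distinction on a table like the one in the question, here is a minimal sketch. The inline markup is a made-up stand-in for the giftList table, not the real page, and the variable names child_tags and descendant_tags are only for illustration. It collects tag names instead of printing whole nodes, which makes the difference in depth easier to see.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical inline markup standing in for the giftList table (not the real page).
html = (
    '<table id="giftList">'
    '<tr><th>Item</th><th>Cost</th></tr>'
    '<tr><td>Vegetable Basket</td><td><span>$15.00</span></td></tr>'
    '</table>'
)

bs = BeautifulSoup(html, 'html.parser')
table = bs.find('table', {'id': 'giftList'})

# .children: only the nodes directly inside <table> (here, the two rows).
child_tags = [node.name for node in table.children if isinstance(node, Tag)]

# .descendants: every nested node at any depth, in document order.
# Text nodes are skipped here so that only tag names are listed.
descendant_tags = [node.name for node in table.descendants if isinstance(node, Tag)]

print(child_tags)       # ['tr', 'tr']
print(descendant_tags)  # ['tr', 'th', 'th', 'tr', 'td', 'td', 'span']
```

Note that print(child) on a row prints the row's entire markup, nested tags included. So with .children each row appears once, while with .descendants each row appears and then its cells, spans, and text are printed again on their own. The output is longer, but it can look similar at a glance because the same markup keeps reappearing.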