Get href value BeautifulSoup

Time:11-12

How do I get the href value of each child link found on the pages in a list of links?

Code:

a_links = []

for link in links:
    response = requests.get(link)
    soup_link = BeautifulSoup(response.text, 'lxml')
    a_cont = soup_link.find_all('div', class_= 'detail__anchor-numb')
    for a in a_cont.find_all('a'):
        a_link = a['href']
        a_links.append(a_link)

Output:

AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

CodePudding user response:

I believe your error is in:

 a_cont = soup_link.find_all('div', class_= 'detail__anchor-numb')
 for a in a_cont.find_all('a'):
    ...

a_cont is a ResultSet (a list-like collection of tags) because you called find_all, and a ResultSet has no find_all method of its own. If you expect only one matching div, call find instead, which returns a single tag.

Otherwise, the simplest fix is to loop over a_cont, though your code will become more deeply nested. You might consider a refactor after this.
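To make the difference concrete, here is a minimal sketch that runs without any network access. The HTML string is hypothetical sample markup standing in for one fetched page, and 'html.parser' is used instead of 'lxml' only so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for one fetched page.
html = """
<div class="detail__anchor-numb"><a href="/a">A</a><a href="/b">B</a></div>
<div class="detail__anchor-numb"><a href="/c">C</a></div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag (or None if nothing matches),
# so find_all() can be called on the result directly.
first_div = soup.find("div", class_="detail__anchor-numb")
first_hrefs = [a["href"] for a in first_div.find_all("a")]

# find_all() returns a ResultSet, which must be iterated before
# calling find_all() on each individual Tag.
all_divs = soup.find_all("div", class_="detail__anchor-numb")
all_hrefs = [a["href"] for div in all_divs for a in div.find_all("a")]

print(first_hrefs)
print(all_hrefs)
```

With find, only the links in the first matching div are collected; with find_all plus a loop, the links from every matching div are collected.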

Example code:

a_cont = soup_link.find_all('div', class_='detail__anchor-numb')
for div in a_cont:  # a_cont is a ResultSet, so visit each div in turn
    for a in div.find_all('a'):
        a_links.append(a['href'])

Notice that the error message itself points to this. Python and its widely used packages are good about hinting at where you might have gone wrong, and reading those messages closely helps a lot with bugs like this one.
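One possible shape for the refactor mentioned above is to move the parsing into a small helper so the outer loop stays flat. This is only a sketch: extract_hrefs is a hypothetical name, not something from the question, and 'html.parser' is swapped in for 'lxml' so the example runs without extra dependencies.

```python
from bs4 import BeautifulSoup


def extract_hrefs(html, div_class="detail__anchor-numb"):
    """Return the href of every <a> inside each <div> with the given class.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    soup = BeautifulSoup(html, "html.parser")
    hrefs = []
    for div in soup.find_all("div", class_=div_class):
        # href=True skips anchors that have no href attribute,
        # which would otherwise raise KeyError on a["href"].
        for a in div.find_all("a", href=True):
            hrefs.append(a["href"])
    return hrefs


# Hypothetical sample markup: one anchor with an href, one without.
sample = '<div class="detail__anchor-numb"><a href="/x">X</a><a>no href</a></div>'
result = extract_hrefs(sample)
print(result)
```

Inside the original loop this would be used as a_links.extend(extract_hrefs(response.text)), which keeps the request logic and the parsing logic separate.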

CodePudding user response:

You can use a CSS selector to grab the href values as follows:

a_links = []

for link in links:
    response = requests.get(link)
    soup_link = BeautifulSoup(response.text, 'lxml')
    a_cont = soup_link.select('div.detail__anchor-numb a')
    for a in a_cont:
        a_link = a['href']
        a_links.append(a_link)
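The selector can be tried on its own without fetching anything. The sketch below uses hypothetical sample markup and 'html.parser' in place of 'lxml'. It also narrows the selector to a[href], an addition not in the answer above, so that anchors without an href attribute are skipped.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one link is nested deeper, one is in another div.
html = """
<div class="detail__anchor-numb">
  <a href="/a">A</a>
  <span><a href="/b">B</a></span>
  <a>no href</a>
</div>
<div class="other"><a href="/skip">skip</a></div>
"""

soup = BeautifulSoup(html, "html.parser")

# The space in the selector is the descendant combinator, so links at any
# depth inside the matching div are returned; [href] requires the attribute.
hrefs = [a["href"] for a in soup.select("div.detail__anchor-numb a[href]")]
print(hrefs)
```

Because select already returns a flat list of matching tags, no nested loop over the divs is needed.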