How to get the value of HTML tags with BeautifulSoup in Python?

I scraped multiple pages. Some of them contain a div with the class red and some do not. For pages that have the red class, I want to store the values in an array; for pages that do not have it, I want an empty array instead. I wrote the code below, and now I want to get the text values out of the tags it collects. Can you help me?

    for i in soup:
        search = i.find_all('div', {'class':"red"})
        if len(search)>0:
            whoFollowThisDr.append(i.find_all('div', {'class':"info"},'span'))
            i = i.text
        else:
            whoFollowThisDr.append(' ')
    whoFollowThisDr

Output:

[[<div class="info"> <strong> a</strong> <span>b</span> </div>,
  <div class="info"> <strong> c</strong> <span>d</span> </div>,
  <div class="info"> <strong style="font-size: 15px !important;"> e</strong> <span style="font-size: 12px !important;">f</span> </div>,
  <div class="info"> <strong style="font-size: 15px !important;"> g</strong> <span style="font-size: 12px !important;">h</span> </div>],
 [<div class="info"> <strong> i</strong> <span>j</span> </div>]]

What I want:

[[a,c,e],[i]]

CodePudding user response:

The line i = i.text has no effect, because i is never used after the assignment. Call .text at the point where you append to the list instead, and use a list comprehension to apply it to each matching element:

    whoFollowThisDr.append([div.text for div in i.find_all('div', {'class':"info"},'span')])
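
One thing to watch: .text on the whole div returns the text of both the strong and the span, including the surrounding whitespace, while the desired output seems to contain only the strong values. Also, the third positional argument of find_all is recursive, so passing 'span' there does not filter on span tags. Below is a minimal sketch of one way to get just the strong text, assuming each element of soup is a parsed page. The sample HTML, the names pages_html and soups, and the choice of an empty list for pages without the red class are assumptions made for illustration, not taken from the original pages.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for the scraped pages:
# the first and third contain a "red" div, the second does not.
pages_html = [
    '<div class="red"></div>'
    '<div class="info"> <strong> a</strong> <span>b</span> </div>'
    '<div class="info"> <strong> c</strong> <span>d</span> </div>',
    '<div class="info"> <strong> x</strong> <span>y</span> </div>',
    '<div class="red"></div>'
    '<div class="info"> <strong> i</strong> <span>j</span> </div>',
]
soups = [BeautifulSoup(html, "html.parser") for html in pages_html]

who_follow_this_dr = []
for page in soups:
    if page.find_all("div", {"class": "red"}):
        # Keep only the stripped <strong> text of each "info" div.
        names = [
            div.strong.get_text(strip=True)
            for div in page.find_all("div", {"class": "info"})
            if div.strong is not None
        ]
        who_follow_this_dr.append(names)
    else:
        # Pages without a "red" div contribute an empty list.
        who_follow_this_dr.append([])

print(who_follow_this_dr)  # [['a', 'c'], [], ['i']]
```

If you would rather drop the pages without the red class entirely instead of keeping an empty list for them, remove the else branch.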