I scraped multiple pages. Some of them contain a div with the class red and some do not. For pages that have this class, I want to store the values in a list; for pages that do not have it, I want an empty list in their place. I wrote the code below, and now I want to get the text values out of the result instead of the tags. Can you help me?
for i in soup:
    search = i.find_all('div', {'class':"red"})
    if len(search)>0:
        whoFollowThisDr.append(i.find_all('div', {'class':"info"},'span'))
        i = i.text
    else:
        whoFollowThisDr.append(' ')
whoFollowThisDr
output:
[[<div class="info"> <strong> a</strong> <span>b</span> </div>,
<div class="info"> <strong> c</strong> <span>d</span> </div>,
<div class="info"> <strong style="font-size: 15px !important;"> e</strong> <span style="font-size: 12px !important;">f</span> </div>,
<div class="info"> <strong style="font-size: 15px !important;"> g</strong> <span style="font-size: 12px !important;">h</span> </div>],
[<div class="info"> <strong> i</strong> <span>j</span> </div>]]
What I want:
[[a,c,e],[i]]
CodePudding user response:
The line i = i.text has no effect, since you never use i after the assignment. You should use .text when you're appending to the list. Use a list comprehension to call it on each element.
whoFollowThisDr.append([div.text for div in i.find_all('div', {'class':"info"},'span')])
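For reference, here is a minimal end-to-end sketch of the same idea. It rests on a few assumptions that are not in the question: that soup is a list of parsed pages, that the wanted values are the text of the strong tag inside each info div (as the desired output suggests), and that pages without a red div should contribute an empty list. The sample HTML in pages_html is made up for illustration. It also drops the 'span' argument, because the third positional parameter of find_all is recursive, so 'span' does not filter anything there.

```python
from bs4 import BeautifulSoup

# Hypothetical sample pages: the first contains a "red" div, the second does not.
pages_html = [
    '<div class="red"></div>'
    '<div class="info"> <strong> a</strong> <span>b</span> </div>'
    '<div class="info"> <strong> c</strong> <span>d</span> </div>',
    '<div class="info"> <strong> x</strong> <span>y</span> </div>',
]

# Assumption: "soup" in the question is a list with one parsed document per page.
soup = [BeautifulSoup(html, "html.parser") for html in pages_html]

whoFollowThisDr = []
for page in soup:
    if page.find("div", {"class": "red"}) is not None:
        # Collect the stripped text of the <strong> tag in every "info" div.
        names = [
            div.strong.get_text(strip=True)
            for div in page.find_all("div", {"class": "info"})
            if div.strong is not None
        ]
        whoFollowThisDr.append(names)
    else:
        # No "red" div on this page: keep its slot, but leave it empty.
        whoFollowThisDr.append([])

print(whoFollowThisDr)  # [['a', 'c'], []]
```

If you would rather keep the whole text of each info div, replace div.strong.get_text(strip=True) with div.get_text(strip=True); note that this also includes the span text.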