Convert list of lists of bs4.element.ResultSets into a list of strings

Time:03-25

While scraping a website, I ended up with a list of lists containing bs4.element.ResultSet objects from find_all() searches. The output of my variable (features) looks like this:

[[<li >Terrasse</li>],
[<li >Balkon</li>, <li >Video Live-Besichtigung</li>],
[<li >Balkon</li>, <li >Terrasse</li>],
[],
[<li >Neubauprojekt</li>, <li >Garten</li>, <li >Balkon</li>, <li >Terrasse</li>]]

I would now like to turn this into a list of strings like this:

["Terrasse",
"Balkon, Video Live-Besichtigung",
"Balkon, Terrasse",
"",
"Neubauprojekt, Garten, Balkon, Terrasse"]

I have tried many things but sometimes the empty list is not converted to an empty string or the order is not consistent. Thank you very much in advance.

CodePudding user response:

You can do that with a list comprehension, as follows:

from bs4 import BeautifulSoup

txt = '''
[[<li >Terrasse</li>],
[<li >Balkon</li>, <li >Video Live-Besichtigung</li>],
[<li >Balkon</li>, <li >Terrasse</li>],
[],
[<li >Neubauprojekt</li>, <li >Garten</li>, <li >Balkon</li>, <li >Terrasse</li>]]
'''

soup = BeautifulSoup(txt, 'lxml')

# The <li> tags in this sample carry no class attribute, so match on the tag name alone
data = [x.get_text() for x in soup.find_all("li")]
print(data)

Output (a single flat list, not grouped per inner list):

['Terrasse', 'Balkon', 'Video Live-Besichtigung', 'Balkon', 'Terrasse', 'Neubauprojekt', 'Garten', 'Balkon', 'Terrasse']

CodePudding user response:

You need nested for loops to extract the text of each element. In your case:

inner_text = []
# features is the list of lists you get from bs4
for lis in features:
    # Collect each item's text, then join once so there is no leading ", "
    texts = []
    for item in lis:
        texts.append(item.text)
    # An empty inner list joins to an empty string
    inner_text.append(", ".join(texts))
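
For reference, here is a self-contained sketch of the same idea written as a nested comprehension. The markup is an assumption (one hypothetical `<ul>` per listing, since the question does not show the page's HTML), and `html.parser` is used only to avoid an extra dependency. Joining with `", ".join(...)` keeps the original order of the items and turns an empty ResultSet into an empty string.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one <ul> per listing, mirroring the question's data.
html = """
<ul><li>Terrasse</li></ul>
<ul><li>Balkon</li><li>Video Live-Besichtigung</li></ul>
<ul><li>Balkon</li><li>Terrasse</li></ul>
<ul></ul>
<ul><li>Neubauprojekt</li><li>Garten</li><li>Balkon</li><li>Terrasse</li></ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Each find_all() call returns a bs4.element.ResultSet, so `features`
# is a list of ResultSets, as in the question.
features = [ul.find_all("li") for ul in soup.find_all("ul")]

# One string per inner ResultSet; an empty ResultSet joins to "".
result = [
    ", ".join(li.get_text(strip=True) for li in group)
    for group in features
]
print(result)
```

With real scraped data, only the last comprehension should be needed, applied to the existing `features` variable.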