Python Convert BeautifulSoup Output to Set?

Time:06-14

In Python I have:

def tag_visible(element):
    return True


def get_visible_text(soup):
    text_tags = soup.find_all(text=True)
    visible_texts = filter(tag_visible, text_tags)
    stripped = set()
    for text in visible_texts:
        stripped.add(text.strip())
    return stripped

I have two questions:

  1. How can I convert visible_texts into a set in one line?

  2. Is there a data structure in Python that, like a set, holds no duplicates but also preserves the order of its elements?


UPDATE:

I can do:

return set(visible_texts)

But how do I apply the strip function to each element?

User response:

For the second question: dicts preserve insertion order (guaranteed by the language since Python 3.7), and their keys are unique. A dict holds key-value pairs; here only the keys matter, so every value is simply set to True.

I'm not sure what you are trying to achieve by using filter with a function that always returns True, since it filters nothing out. Please clarify.

def get_visible_text(soup):
    text_tags = soup.find_all(text=True)
    return dict((text.strip(), True) for text in text_tags)
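A minimal sketch of the same idea using dict.fromkeys, which builds the ordered, duplicate-free keys without spelling out a dummy value. The helper name unique_stripped and the sample list are made up for illustration; plain strings stand in for the text nodes returned by find_all, which behave like str for strip().

```python
def unique_stripped(texts):
    """Return stripped strings with duplicates removed, in first-seen order."""
    # dict.fromkeys keeps the first occurrence of each key, and dicts
    # preserve insertion order (guaranteed since Python 3.7).
    return list(dict.fromkeys(text.strip() for text in texts))


# Hypothetical sample data standing in for soup.find_all(...) results.
sample = ["  Hello ", "World", "Hello", "\n", "World  "]
result = unique_stripped(sample)
print(result)  # ['Hello', 'World', '']
```

Note that the whitespace-only entry survives as an empty string; whether to drop it depends on what you want to count as visible text.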

For the first question, you can apply the strip function by using a set comprehension:

return {text.strip() for text in visible_texts}

Note, however, that a set is unordered, so the insertion order is not preserved. Use the dict approach if the order matters.
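If the filter step is meant to do real work, the filtering and stripping can be folded into one set comprehension. This is only a sketch: is_visible is a hypothetical predicate (here it rejects whitespace-only strings) standing in for tag_visible, and plain strings stand in for the text nodes from the soup.

```python
def is_visible(text):
    # Hypothetical predicate: treat whitespace-only strings as not visible.
    return bool(text.strip())


def visible_text_set(texts):
    """Filter and strip in a single set comprehension."""
    return {text.strip() for text in texts if is_visible(text)}


# Hypothetical sample data standing in for soup.find_all(...) results.
sample = ["  Hello ", "World", "Hello", "\n", "World  "]
result = visible_text_set(sample)
print(sorted(result))  # ['Hello', 'World']
```

The result is sorted before printing only because a set has no defined order of its own.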
