Conditional operators in Beautiful Soup findAll by attribute value

Time:09-03

I want to find all td elements whose custom HTML attribute data-stat is not equal to a given value, i.e. everything except data-stat="randomValue".
My data looks something like this:

<td data-stat="foo">10</td>
<td data-stat="bar">20</td>
<td data-stat="test">30</td>
<td data-stat="DUMMY"> </td>

I know that I can just select for foo, bar, and test, but my actual dataset will have hundreds of different values for data-stat, so it wouldn't be feasible to code them all.

Is there something like a != operator that I can use in Beautiful Soup? I tried doing:

[td.getText() for td in rows[i].findAll('td:not([data-stat="DUMMY"])')]

but I only get [] as a value.

CodePudding user response:

You can use a list comprehension to filter out the unwanted tags, for example:

print([td.text for td in soup.find_all("td") if td.get("data-stat") != "DUMMY"])
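
For a runnable version of that filter, here is a minimal sketch. It assumes the four cells from the question sit inside a single table row and that the built-in "html.parser" is used; the html string and the values name are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup from the question, wrapped in a table row (assumed layout).
html = """
<table><tr>
<td data-stat="foo">10</td>
<td data-stat="bar">20</td>
<td data-stat="test">30</td>
<td data-stat="DUMMY"> </td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# td.get() returns None when the attribute is missing, so cells without
# data-stat are kept as well; only data-stat="DUMMY" is dropped.
values = [td.text for td in soup.find_all("td") if td.get("data-stat") != "DUMMY"]
print(values)  # ['10', '20', '30']
```

The same condition can be swapped for any Python expression, e.g. td.get("data-stat") not in {"DUMMY", "OTHER"}, if more than one value has to be excluded.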

Or use a CSS selector with .select (as @Barmar said in the comments, .find_all doesn't accept CSS selectors; it treats the string as a tag name, which is why you got []):

print([td.text for td in soup.select('td:not([data-stat="DUMMY"])')])
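
To see the difference between the two methods side by side, here is a small sketch. The html string and variable names are made up for illustration, and it assumes a Beautiful Soup 4 release recent enough to support :not() in .select (handled by the soupsieve package in current versions).

```python
from bs4 import BeautifulSoup

# One normal cell, one DUMMY cell, and one cell with no data-stat at all.
html = (
    '<table><tr>'
    '<td data-stat="foo">10</td>'
    '<td data-stat="DUMMY"> </td>'
    '<td>40</td>'
    '</tr></table>'
)
soup = BeautifulSoup(html, "html.parser")

selector = 'td:not([data-stat="DUMMY"])'

# find_all looks for a tag literally named like the selector string: no match.
as_tag_name = soup.find_all(selector)

# select parses the string as a CSS selector and applies :not().
as_selector = [td.text for td in soup.select(selector)]

print(as_tag_name)  # []
print(as_selector)  # ['10', '40']
```

Note that the cell without any data-stat attribute is also returned, which matches the behaviour of the list comprehension above.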