Is there a way to find tags in BeautifulSoup that do not contain a specific class?

Time:10-20

I'm trying to scrape a table on a page that has classes on each row. There are some classes that signify that the event has yet to take place and I want to avoid these. The table is similar to this:

<tr class="TRow1 TFuture">
<tr class="TRow2 TFuture">
<tr class="TRow1 TFuture">
<tr class="TRow2 TPresent">
<tr class="TRow1 TPast">
<tr class="TRow2">

All I seem to be able to find is how to select a class that I want. Is there any way to select everything except for a class I don't want?

CodePudding user response:

You can use the CSS :not() pseudo-class in the selector you pass to select():

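from bs4 import BeautifulSoup as soup

# Sample rows taken from the question
s = """
<tr class="TRow1 TFuture"></tr>
<tr class="TRow2 TFuture"></tr>
<tr class="TRow1 TFuture"></tr>
<tr class="TRow2 TPresent"></tr>
<tr class="TRow1 TPast"></tr>
<tr class="TRow2"></tr>
"""
# tr:not(.TFuture) matches every <tr> whose class list does not include TFuture
tr = soup(s, 'html.parser').select('tr:not(.TFuture)')
print(tr)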

Output:

[<tr class="TRow2 TPresent"></tr>, <tr class="TRow1 TPast"></tr>, <tr class="TRow2"></tr>]
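If you prefer find_all() over CSS selectors, or need to exclude more than one class, something along these lines should also work. This is only a sketch on the same sample rows: the helper name is_not_future and the variable names are made up for illustration, and the surrounding <table> tag is added here just to make the markup complete.

```python
from bs4 import BeautifulSoup

html = """
<table>
<tr class="TRow1 TFuture"></tr>
<tr class="TRow2 TFuture"></tr>
<tr class="TRow1 TFuture"></tr>
<tr class="TRow2 TPresent"></tr>
<tr class="TRow1 TPast"></tr>
<tr class="TRow2"></tr>
</table>
"""

doc = BeautifulSoup(html, "html.parser")


def is_not_future(tag):
    """Hypothetical filter: True for <tr> tags without the TFuture class."""
    # "class" is parsed as a list of names; default to [] if it is absent.
    return tag.name == "tr" and "TFuture" not in tag.get("class", [])


# find_all() accepts a function and keeps the tags for which it returns True.
rows = doc.find_all(is_not_future)
row_classes = [row.get("class", []) for row in rows]

# Chaining :not() excludes several classes in a single selector.
past_rows = doc.select("tr:not(.TFuture):not(.TPresent)")
past_classes = [row.get("class", []) for row in past_rows]

print(row_classes)
print(past_classes)
```

The function form is handy when the condition is more than a class check (for example, also looking at the row's text), while the chained selector keeps simple exclusions on one line.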