BeautifulSoup: get data from a class whose name changes

Time:06-28

I'm using BeautifulSoup to scrape an HTML page where the information I need is stored in markup like this:

<a  href="site0.html"> Title 0 </a>
<a  href="site1.html"> Title 1 </a>
<a  href="site2.html"> Title 2 </a>
[...]

I'd like to get "Title 0", "Title 1" and "Title 2", but the class name changes for each item, so I'm using a regex like this:

titles = soup.findAll("a", attrs={"class": re.compile('^TitleonContext.*')})

for title in titles:
    print(title)

But it's not working (nothing is printed). What am I doing wrong?

CodePudding user response:

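The leading ^ anchors the pattern to the start of the class value, so your regex only matches classes that begin with TitleonContext. Try re.compile(r'.*TitleonContext') or re.compile('.*TitleonContext') instead, which matches TitleonContext anywhere in the class name.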
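
To make the difference concrete, here is a minimal sketch. The class values in the markup are made up for illustration (the snippet in the question does not show them), and it assumes the real class names have some prefix before TitleonContext. As far as I know, BeautifulSoup tests a compiled pattern with search(), so the leading .* is optional and a bare TitleonContext behaves the same way.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: these class values are assumptions, not taken
# from the original page.
html = """
<a class="x-TitleonContext-a1" href="site0.html"> Title 0 </a>
<a class="x-TitleonContext-b2" href="site1.html"> Title 1 </a>
<a class="other" href="site2.html"> Not a title </a>
"""

soup = BeautifulSoup(html, "html.parser")

# Anchored pattern: only matches a class that starts with "TitleonContext".
anchored = soup.find_all("a", class_=re.compile(r"^TitleonContext"))

# Unanchored pattern: matches "TitleonContext" anywhere in the class.
unanchored = soup.find_all("a", class_=re.compile(r"TitleonContext"))

# get_text(strip=True) returns the link text without surrounding spaces,
# rather than the whole <a> tag.
titles = [a.get_text(strip=True) for a in unanchored]

print(len(anchored))
print(titles)
```

If the anchored search still returns nothing on the real page, printing a few of the actual class values (for example a.get("class") for each link) should show what the pattern needs to match.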
