How to extract text from HTML in Python with BeautifulSoup4

I am trying to extract only the file name from the HTML tags below.

In the end, I would like output like this:

BeforeStructure.PNG
AfterStructure.PNG

Could you please show me how to extract only the file name from the markup below? Thanks!

<div><br> </div><div><img src=\"https://azure.com/8dd91aab-0dce-41e2-95d4-eb69e06c68fe/_apis/wit/attachments/c4e5aab1-0877-44ea-ad2d-2c614ac56984?fileName=BeforeStructure.PNG\" alt=BeforeStructure.PNG><br> </div><div><br> </div><div><img src=\"https://azure.com/8dd91aab-0dce-41e2-95d4-eb69e06c68fe/_apis/wit/attachments/af842f67-1a3d-48dc-8c8b-396a28b306ce?fileName=AfterStructure.PNG\" alt=AfterStructure.PNG><br> </div>"

CodePudding user response：

Instead of trying to extract the file name from the link in the src attribute, you can just extract the name in the alt attribute.

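from bs4 import BeautifulSoup
soup = BeautifulSoup(html, "html.parser")  # html is assumed to hold the markup from the question
for img in soup.find_all("img"):
    print(img.get("alt"))  # the alt attribute already contains the bare file name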

Output:

BeforeStructure.PNG
AfterStructure.PNG
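
If the alt attribute is missing or unreliable, the file name can also be read from the fileName query parameter of the src URL. The following is a minimal sketch using only the standard library. The helper name file_name_from_src is hypothetical, and the sketch assumes the backslash-escaped quotes in the question's markup have already been removed, so that src is a clean URL.

```python
from urllib.parse import urlparse, parse_qs


def file_name_from_src(src):
    """Return the fileName query parameter of a URL, or None if it is absent."""
    query = urlparse(src).query               # e.g. "fileName=BeforeStructure.PNG"
    values = parse_qs(query).get("fileName")  # list of values, or None
    return values[0] if values else None


# One of the src URLs from the question, with the escaping removed.
src = (
    "https://azure.com/8dd91aab-0dce-41e2-95d4-eb69e06c68fe/_apis/wit/"
    "attachments/c4e5aab1-0877-44ea-ad2d-2c614ac56984?fileName=BeforeStructure.PNG"
)
print(file_name_from_src(src))  # BeforeStructure.PNG
```

With BeautifulSoup, this could be applied to each tag by passing img.get("src") to the helper inside the loop.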

CodePudding user response：

file_name = soup.find("img").get("alt")

An <img> tag has no inner text, so read its alt attribute rather than calling getText(). find returns only the first match; if you want every match, use find_all instead.
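
To make the difference between the two calls concrete, here is a minimal sketch. It assumes a shortened stand-in for the markup in the question: the src values are placeholders, not the real attachment URLs.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical version of the markup from the question.
html = (
    '<div><img src="before.png" alt="BeforeStructure.PNG"><br></div>'
    '<div><img src="after.png" alt="AfterStructure.PNG"><br></div>'
)
soup = BeautifulSoup(html, "html.parser")

# find returns the first matching tag, or None when nothing matches.
first = soup.find("img")
first_name = first.get("alt") if first is not None else None

# find_all returns every matching tag (an empty list when nothing matches).
all_names = [img.get("alt") for img in soup.find_all("img")]

print(first_name)  # BeforeStructure.PNG
print(all_names)   # ['BeforeStructure.PNG', 'AfterStructure.PNG']
```

Checking for None before calling get avoids an AttributeError when the document contains no <img> tag at all.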