I am trying to extract only the file names from the HTML tags below.
In the end I would like the output to be:
BeforeStructure.PNG
AfterStructure.PNG
Can you please guide me on how to extract only the file name from the code below? Thanks!
<div><br> </div><div><img src=\"https://azure.com/8dd91aab-0dce-41e2-95d4-eb69e06c68fe/_apis/wit/attachments/c4e5aab1-0877-44ea-ad2d-2c614ac56984?fileName=BeforeStructure.PNG\" alt=BeforeStructure.PNG><br> </div><div><br> </div><div><img src=\"https://azure.com/8dd91aab-0dce-41e2-95d4-eb69e06c68fe/_apis/wit/attachments/af842f67-1a3d-48dc-8c8b-396a28b306ce?fileName=AfterStructure.PNG\" alt=AfterStructure.PNG><br> </div>"
CodePudding user response:
Instead of trying to extract the file name from the link in the src attribute, you can just read it from the alt attribute.
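from bs4 import BeautifulSoup
soup = BeautifulSoup(html, "html.parser")  # html: assumed to hold the markup string from the question
for img in soup.find_all("img"):
    print(img.get("alt"))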
Output -
BeforeStructure.PNG
AfterStructure.PNG
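If the alt attribute is ever missing, the name can still be recovered from the fileName query parameter in the src link. The following is only a sketch using the standard library: filename_from_src is a hypothetical helper name, and stripping leftover \" characters is an assumption based on the escaped quotes in the posted markup.

```python
from urllib.parse import urlparse, parse_qs


def filename_from_src(src):
    """Return the fileName query parameter of an attachment URL, or None."""
    # Assumption: the value may still carry escaped quotes (\") from the
    # original string, so strip backslashes and double quotes at both ends.
    cleaned = src.strip('\\"')
    query = parse_qs(urlparse(cleaned).query)
    values = query.get("fileName")
    return values[0] if values else None


src = (
    "https://azure.com/8dd91aab-0dce-41e2-95d4-eb69e06c68fe/_apis/wit/"
    "attachments/c4e5aab1-0877-44ea-ad2d-2c614ac56984?fileName=BeforeStructure.PNG"
)
print(filename_from_src(src))
```

With BeautifulSoup you would pass img.get("src") to this helper inside the same find_all loop.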
CodePudding user response:
file_name = soup.find("img").get("alt")
find returns only the first match. If you want every match, use find_all and loop over the results.