Is there a Python string method (like .findall, .find, etc.) that can directly find what is wanted? For example, to get all the hyperlinks in an HTML file that include 'www', something like:
html.findall(www)
Of course the syntax is not right, but a simple one-line call without many arguments would help.
CodePudding user response:
Here is a simple example that uses the re module to find all websites that start with www:
import re
string = """<a href="stackoverflow.com">Stack Overflow</a>
<a href="github.com">Github</a>
<a href="www.google.com">Google</a>
<a href="www.madeupwebsite.com">Made Up</a>
<a href="pypi.org">PyPi</a>
"""
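# Lookbehind/lookahead keep the surrounding quotes out of the match;
# [^"]* stops at the closing quote instead of running to the end of the line.
print(re.findall(r'(?<=")www\.[^"]*(?=")', string))  # Find all non-overlapping matches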
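
If the regular expression feels brittle for real HTML (attributes in a different order, single quotes, line breaks inside a tag), one possible alternative is to let the standard library's html.parser read the tags and then filter the href values. This is only a rough sketch: the names WwwLinkCollector and find_www_links are made up for illustration, and it assumes "include 'www'" means a plain substring check on the href.

```python
from html.parser import HTMLParser


class WwwLinkCollector(HTMLParser):
    """Hypothetical helper: collects href values of <a> tags that contain 'www'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and "www" in value:
                self.links.append(value)


def find_www_links(html_text):
    """Return every <a href=...> value in html_text that contains 'www'."""
    collector = WwwLinkCollector()
    collector.feed(html_text)
    return collector.links


sample = '<a href="www.google.com">Google</a> <a href="pypi.org">PyPi</a>'
print(find_www_links(sample))
```

With this in place, the call is close to the one-liner asked for: find_www_links(html) returns a list of the matching links.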