I am new to regex and also to Python. I am trying to parse a link out of some HTML, and came across the solution [^"]*
For example:
import re
pattern = '''[^"]*'''
matcher = re.compile(pattern)
matches = matcher.search('https://www.yelp.com/adredir?ad_business_id=PX_xyQcEj1bnaec2oMwH2')
print(matches.group())
This pattern successfully matches the link. Is that because it is saying that anything between the single quotes in the string is a match? In what case would you use [^']* instead of [^"]*?
CodePudding user response:
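[^"]* is a negated character class followed by a quantifier. [^"] matches any single character that is not a double quote, and * repeats it "zero or more times". So the match keeps consuming characters until the first " is found or the string ends. It has nothing to do with the single quotes that delimit the Python string. In your example the string contains no " at all, so the whole string matches. I recommend this tutorial and this playground.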
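To see what "zero or more" means in practice, here is a small sketch comparing * with +. The sample string is made up for illustration; it deliberately starts with a double quote so that the two quantifiers behave differently.

```python
import re

# Made-up sample text that begins with a double quote.
text = '"quoted"'

# * allows zero repetitions, so the search succeeds at position 0
# with an empty match (the first character is a double quote).
star = re.search(r'[^"]*', text)

# + needs at least one non-quote character, so the search moves on
# and matches the run of characters after the opening quote.
plus = re.search(r'[^"]+', text)

print(repr(star.group()))  # ''
print(plus.group())        # quoted
```

If an empty match is not useful to you, + is usually the safer quantifier.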
Example Code:
import re
pattern = '''[^"]*'''
matcher = re.compile(pattern)
matches = matcher.search(
'https://www.yelp.com/adredir?ad_business_id=PX_xyQcEj1bnaec2oMwH2"asd')
print(matches.group())
This outputs:
https://www.yelp.com/adredir?ad_business_id=PX_xyQcEj1bnaec2oMwH2
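Coming back to the original goal of pulling a link out of HTML, the following is a minimal sketch, assuming the link sits in an href attribute. The two HTML snippets are invented for illustration. It also shows when [^']* would be the right choice: when the attribute value is delimited by single quotes rather than double quotes.

```python
import re

# Hypothetical HTML snippets, made up for illustration.
double_quoted = '<a href="https://www.yelp.com/adredir?ad_business_id=PX_xyQcEj1bnaec2oMwH2">ad</a>'
single_quoted = "<a href='https://www.yelp.com/adredir?ad_business_id=PX_xyQcEj1bnaec2oMwH2'>ad</a>"

# [^"]* stops at the first double quote, so it captures a double-quoted value.
double_match = re.search(r'href="([^"]*)"', double_quoted)

# [^']* stops at the first single quote, so it captures a single-quoted value.
single_match = re.search(r"href='([^']*)'", single_quoted)

print(double_match.group(1))
print(single_match.group(1))
```

Both calls print the same URL. For anything beyond a quick one-off, an HTML parser such as the standard library's html.parser is generally more robust than a regex.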
CodePudding user response:
"*" - means "0 or more instances of the preceding regex token"