pattern matching in Python with regex problem

Time:12-04

I am trying to learn pattern matching with regex. The course is on Coursera and hasn't been updated since Python 3 came out, so the instructor's code is not working correctly.

Here's what I have so far:

import re

# example Wiki data
wiki= """There are several Buddhist universities in the United States. Some of these have existed for decades and are accredited. Others are relatively new and are either in the process of being accredited or else have no formal accreditation. The list includes: 
• Dhammakaya Open University – located in Azusa, California, 
• Dharmakirti College – located in Tucson, Arizona 
• Dharma Realm Buddhist University – located in Ukiah, California 
• Ewam Buddhist Institute – located in Arlee, Montana
• Naropa University - located in Boulder, Colorado 
• Institute of Buddhist Studies – located in Berkeley, California
• Maitripa College – located in Portland, Oregon
• Soka University of America – located in Aliso Viejo, California
• University of the West – located in Rosemead, California 
• Won Institute of Graduate Studies – located in Glenside, Pennsylvania"""




pattern = re.compile(
    r'(?P<title>.*)'  # the university title
    r'(-\ located\ in\ )'  # an indicator of the location
    r'(?P<city>\w*)'  # city the university is in
    r'(,\ )'  # separator for the state
    r'(?P<state>\w.*)')  # the state the city is in


for item in re.finditer(pattern, wiki, re.VERBOSE):
    print(item.groupdict())

Output:

Traceback (most recent call last):
  File "/Users/r..., line 194, in <module>
    for item in re.finditer(pattern, wiki, re.VERBOSE):
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/re/__init__.py", line 223, in finditer
    return _compile(pattern, flags).finditer(string)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/re/__init__.py", line 282, in _compile
    raise ValueError(
ValueError: cannot process flags argument with a compiled pattern

I only want a dictionary with the university name, the city, and the state. If I run it without re.VERBOSE, only one school shows up and the rest are missing. I am somewhat new to Python and don't know what to do about these errors.

CodePudding user response:

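The ValueError is raised because re.finditer cannot take a flags argument when the pattern is already compiled; flags have to be passed to re.compile instead. In this case you do not need re.VERBOSE at all, because your comments are ordinary Python comments outside the string literals. If you run

for item in re.finditer(pattern, wiki):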
    print(item.groupdict())

the program will print

{'title': '• Naropa University ', 'city': 'Boulder', 'state': 'Colorado '}

using Python 3.10.
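
If you do want re.VERBOSE, for example to keep the comments inside the pattern itself, the flag belongs in the re.compile call. The following is only a sketch: the one-line `sample` string and the name `verbose_pattern` are made up for illustration and are not part of the original code.

```python
import re

# Hypothetical one-line sample in the same format as the question's data.
sample = "• Naropa University - located in Boulder, Colorado"

# Flags are given to re.compile, not to re.finditer.
# Under re.VERBOSE, unescaped whitespace and "#" comments inside the
# pattern are ignored, which is why the literal spaces are escaped.
verbose_pattern = re.compile(
    r"""
    (?P<title>.*)        # the university title
    (-\ located\ in\ )   # an indicator of the location
    (?P<city>\w*)        # city the university is in
    (,\ )                # separator for the state
    (?P<state>\w.*)      # the state the city is in
    """,
    re.VERBOSE,
)

matches = [m.groupdict() for m in verbose_pattern.finditer(sample)]
print(matches)

# Passing flags together with an already compiled pattern reproduces
# the ValueError from the traceback.
try:
    re.finditer(verbose_pattern, sample, re.VERBOSE)
    flag_error = None
except ValueError as exc:
    flag_error = str(exc)
print(flag_error)
```

Calling `verbose_pattern.finditer(sample)` directly, as above, avoids the question of where the flags go, since a compiled pattern already carries them.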

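By the way, the program outputs only one school because the other entries use an en dash (–) instead of a hyphen (-). Making all entries use the same character, and changing your pattern accordingly, should give you nearly the whole list. Note that \w* cannot match the space in "Aliso Viejo", so the city group also needs to allow spaces for that entry to appear.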
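
An alternative to editing the data is to let the pattern accept either dash. This is a minimal sketch, not the code from the question: the two-line `sample`, the name `dash_pattern`, and the widened city class `[\w ]*` (so that "Aliso Viejo" matches) are assumptions added for illustration.

```python
import re

# Two lines in the question's format: one uses an en dash, one a hyphen.
sample = (
    "• Soka University of America – located in Aliso Viejo, California\n"
    "• Naropa University - located in Boulder, Colorado"
)

# [-–] matches either a hyphen or an en dash (U+2013).
# [\w ]* lets the city contain spaces; this differs from the original \w*.
dash_pattern = re.compile(
    r"(?P<title>.*)"      # the university title
    r"[-–] located in "   # location indicator with either dash
    r"(?P<city>[\w ]*)"   # city, possibly several words
    r", "                 # separator before the state
    r"(?P<state>\w.*)"    # the state the city is in
)

schools = [m.groupdict() for m in dash_pattern.finditer(sample)]
for school in schools:
    print(school)
```

The captured title still carries the leading bullet and a trailing space, so calling `.strip("• ")` on it may be worthwhile if clean names are needed.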

CodePudding user response:

Thanks to JustLearning, my problem is solved. Here is the code I ended up using. I can't believe it was a long hyphen instead of a short one. And now I know I don't need to use re.VERBOSE. Thank you again.

pattern = re.compile(r'(?P<title>.*)' r'(-\ located\ in\ )' r'(?P<city>.*)' r'(,\ )' r'(?P<state>.*)')
