Efficient use of regex with dictionary

Time:10-24

I have some text files with tags of the form ${key} and a dictionary for the keys. Each tag should be replaced with the dictionary value selected by its key.

I found a way that uses a regular expression to find each tag, looks up the key in the dictionary, and rebuilds the string with the corresponding value. This works but looks a bit clumsy, and I think it could be more efficient with a precompiled regex and without the two slices in each iteration.

How can this be implemented more readably, using built-in Python functions instead of hand-crafted code?

# minimal but complete example code
import re

mydic = { 'a':'alpha', 'b':'gamma' }
s = "some text about ${a} and ${b} but not ${foo}"

while True:
    sr = re.search(r'\${(.+?)}', s)

    if sr is None:  # could the search result be evaluated in the while clause?
        break

    key = sr.group(1)
    a, b = sr.span()
    if key in mydic:
        s = s[:a] + mydic[key] + s[b:]
    else:
        # found unknown key in ${}
        s = s[:a] + s[b:]

# output the result
s

The expected result is "some text about alpha and gamma but not ".

CodePudding user response:

If your text contains no ${ other than at the start of a key, and no other braces that are not meant to be keys (such as a literal {foo}), you can take advantage of the built-in str.format_map method:

from collections import defaultdict

d = defaultdict(str)
d.update(mydic)
s = s.replace('${', '{').format_map(d)
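
To make the precondition concrete, here is a self-contained sketch of the same approach, reusing mydic and s from the question. The variable names result and stray_brace_ok are only illustrative. It also shows what happens when the text contains a brace that is not part of a tag:

```python
from collections import defaultdict

mydic = {'a': 'alpha', 'b': 'gamma'}
s = "some text about ${a} and ${b} but not ${foo}"

# defaultdict(str) returns '' for any missing key, so unknown tags vanish.
d = defaultdict(str)
d.update(mydic)

result = s.replace('${', '{').format_map(d)
print(repr(result))

# A stray brace elsewhere in the text makes format_map raise ValueError.
try:
    "a lone } brace and ${a}".replace('${', '{').format_map(d)
    stray_brace_ok = True
except ValueError:
    stray_brace_ok = False
print(stray_brace_ok)
```

Note that each lookup of an unknown key also inserts that key into the defaultdict with an empty string as its value.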

If you want to use a regex, you can use re.sub:

import re

s = re.sub(r'\${(.+?)}', lambda m: mydic.get(m.group(1), ''), s)
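
Since the question also asks about a precompiled pattern, the re.sub call can be wrapped as follows. This is a minimal sketch, not the only way to do it: the names TAG_PATTERN and expand_tags and the default parameter are hypothetical, and the pattern uses [^}]+ instead of .+? (an assumption on my part, which also lets a key span a newline).

```python
import re

# Compiled once at import time; [^}]+ matches everything up to the closing brace.
TAG_PATTERN = re.compile(r'\$\{([^}]+)\}')

def expand_tags(text, mapping, default=''):
    """Replace each ${key} in text with mapping[key], or default if the key is unknown."""
    return TAG_PATTERN.sub(lambda m: mapping.get(m.group(1), default), text)

mydic = {'a': 'alpha', 'b': 'gamma'}
s = "some text about ${a} and ${b} but not ${foo}"
print(repr(expand_tags(s, mydic)))
```

The re module already caches recently compiled patterns, so the gain from precompiling is likely small. The larger saving over the loop in the question is that re.sub makes a single pass over the string instead of searching and slicing it once per tag. One behavioral difference: the loop rescans the replaced text, so a dictionary value that itself contains ${...} would be expanded again, whereas re.sub leaves it as is.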