Regex exact match

Time:12-19

I have the following sentence: "The size of the lunch box is around 1.5l or 1500ml"

How can I change this to: "The size of the lunch box is around 1.5 liter or 1500 milliliter"

In some cases, the value might also be present as "1.5 l or 1500 ml" with a space.

I am not able to capture the "l" or "ml" when I try to build a function, or it gives me an escape error.

I tried:

def stnd(text):

    text = re.sub('^l%',' liter', text)
    text = re.sub('^ml%',' milliliter', text)

    text = re.sub('^\d+\.\d+\s*l$','^\d+\.\d+\s*liter$', text)
    text = re.sub('^^\d+\.\d+\s*ml$%','^\d+\.\d+\s*milliliter$', text)

    return text

CodePudding user response:

You could use a dict that maps each unit abbreviation to its full name, together with a pattern that finds a digit followed by either ml or l. The matched text then serves as the key to look up the replacement in the dict.

(?<=\d)m?l\b

The pattern matches:

  • (?<=\d) Positive lookbehind, assert a digit to the left
  • m?l\b Match an optional m followed by l and a word boundary
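
To see which parts of a string this pattern selects, a quick check with re.findall can help. This is only a minimal sketch; the sample string and the variable names are made up for illustration.

```python
import re

# Same pattern as above: a unit directly preceded by a digit,
# ending at a word boundary.
pattern = r"(?<=\d)m?l\b"

# Illustrative sample covering attached units, a spaced unit, and a non-unit word.
sample = "1.5l or 1500ml and 2 l or 3large"

matches = re.findall(pattern, sample)
print(matches)
```

Only the units attached directly to a number are found. The "l" in "2 l" is skipped because the lookbehind needs a digit immediately to its left, and "3large" is skipped because there is no word boundary after the "l".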

See a regex demo.

Example

import re

s = "The size of the lunch box is around 1.5l or 1500ml"
pattern = r"(?<=\d)m?l\b"
dct = {
    "ml": "milliliter",
    "l": "liter"
}
# Fall back to the matched text itself if the unit is not in the dict
result = re.sub(pattern, lambda x: " " + dct[x.group()] if x.group() in dct else x.group(), s)
print(result)

Output

The size of the lunch box is around 1.5 liter or 1500 milliliter

CodePudding user response:

We can handle this replacement using a dictionary of lookup values and replacements.

import re

d = {"l": "liter", "ml": "milliliter"}
inp = "The size of the lunch box is around 1.5l or 1500ml"
output = re.sub(r'(\d+(?:\.\d+)?)\s*(ml|l)', lambda m: m.group(1) + " " + d[m.group(2)], inp)
print(output)

# The size of the lunch box is around 1.5 liter or 1500 milliliter

def stnd(text):
    return re.sub(r'(\d+(?:\.\d+)?)\s*(m?l)', lambda m: m.group(1) + " " + d[m.group(2)], text)
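
The question also mentions input written with a space, such as "1.5 l or 1500 ml". The sketch below is one possible variant of the approach above, not a definitive implementation: it keeps the optional whitespace and adds a trailing word boundary so that words merely starting with "l" are left alone. The names UNITS, UNIT_RE and expand_units are hypothetical and chosen only for this example.

```python
import re

# Hypothetical lookup table; extend it with other units as needed.
UNITS = {"l": "liter", "ml": "milliliter"}

# A number with an optional decimal part, optional whitespace, then a unit
# that must end at a word boundary (so "3 large" is not treated as liters).
UNIT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(ml|l)\b")

def expand_units(text):
    # Rebuild each match as "<number> <full unit name>".
    return UNIT_RE.sub(lambda m: m.group(1) + " " + UNITS[m.group(2)], text)

print(expand_units("around 1.5l or 1500ml"))
print(expand_units("around 1.5 l or 1500 ml"))
print(expand_units("3 large boxes"))
```

Both the attached and the spaced forms come out as "1.5 liter or 1500 milliliter", while "3 large boxes" is returned unchanged.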