Modify the regular expression to include additional string

Time:12-12

I have a function

import re

def regularize_string(my_string):
    # Keep only the relevant characters in my_string
    return re.sub('[^A-F0-9x\\|&!()]', '', str(my_string))

This function is used in my viewer so that only expressions such as

(35 & 74) & !(36 | 37)

are accepted; anything else produces an error message.

Now I also want to allow "all", so that my viewer does not throw an error when "all" is typed.

How can I modify the regular expression so that it also accepts "all"?
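To make the problem concrete, here is a small check of what the original function returns for the two inputs mentioned above. The function body is copied from the question; the sample inputs are only illustrative.

```python
import re

def regularize_string(my_string):
    # Drop every character that is not A-F, 0-9, x, |, &, !, ( or )
    return re.sub('[^A-F0-9x\\|&!()]', '', str(my_string))

# Spaces are not in the allowed set, so they are stripped as well
print(regularize_string("(35 & 74) & !(36 | 37)"))  # (35&74)&!(36|37)

# None of 'a', 'l', 'l' is allowed, so nothing is left of "all"
print(repr(regularize_string("all")))  # ''
```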

CodePudding user response:

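Regular expressions can match entire words, but not from inside square brackets.

Square brackets define a character class, which matches one character at a time, so the word "all" cannot simply be added inside them. Appending |all to the pattern does not help either: re.sub replaces every match with the empty string, so "all" would be deleted along with everything else.

Instead, capture the word and write it back in the replacement. Change the return line like this:

return re.sub('(all)|[^A-F0-9x\\|&!()]', r'\1', str(my_string))

In Python 3.5 and later, a group that did not take part in the match expands to an empty string in the replacement, so every other unwanted character is still removed.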

Remember to update the comment on the previous line as well:

# Keep only the relevant characters in my_string, plus the word "all"
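A quick sketch comparing two ways of adding the word to the pattern may help here. The names DISALLOWED, strip_plain_alternation and strip_keep_all are hypothetical, chosen only for this illustration, and the second variant assumes Python 3.5 or later, where an unmatched group in the replacement expands to an empty string.

```python
import re

DISALLOWED = r'[^A-F0-9x|&!()]'  # negated character class from the question

def strip_plain_alternation(text):
    # "all" is just another match here, and every match is deleted
    return re.sub(DISALLOWED + r'|all', '', str(text))

def strip_keep_all(text):
    # Group 1 captures "all"; \1 writes it back, other matches become ''
    return re.sub(r'(all)|' + DISALLOWED, r'\1', str(text))

print(repr(strip_plain_alternation("all")))  # ''
print(strip_keep_all("all"))                 # all
print(strip_keep_all("(35 & 74) & all"))     # (35&74)&all
print(strip_keep_all("foo 3F"))              # 3F
```

Note that this sketch also keeps "all" when it appears inside a longer word, for example "ball" becomes "all"; whether that matters depends on what the viewer should accept.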

Note: A non-capturing group, (?:...), tells the engine that it does not need to store the matched text, whereas a capturing group, (...), does. For small patterns either is fine; for heavy-duty work, check first whether you actually need the match. If you don't, prefer the non-capturing group so that no memory is spent storing something you will never use.
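The difference between the two kinds of group can be seen with a small example. The sample text and patterns below are made up for illustration and are not part of the viewer code.

```python
import re

text = "ababcd"

# Capturing: both groups are stored (a repeated group keeps its last match)
capturing = re.search(r'(ab)+(cd)', text)
print(capturing.groups())      # ('ab', 'cd')

# Non-capturing: (?:ab) matches the same text but is not stored
non_capturing = re.search(r'(?:ab)+(cd)', text)
print(non_capturing.groups())  # ('cd',)
```

Both patterns match the same overall text; only what is kept in the match object differs.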

CodePudding user response:

If I understand correctly, you want the function to leave "all" untouched as well. This code would do that:

import regex as re

def regularize_string(my_string):
    return re.sub('[^A-F0-9x\\|&!()](?=.*all)|(?<=all.*)[^A-F0-9x\\|&!()]', '', str(my_string))

print(regularize_string("qopvg239'9^'2345koasdpallvm4kl"))

prints:

23992345all4

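Keep in mind that for this to work you need to install the third-party module "regex". The standard-library "re" module requires look-behind patterns to have a fixed width, so it rejects ".*" inside a positive look-behind such as (?<=all.*). Also note that this pattern only removes characters when "all" occurs somewhere in the string: if there is no "all", neither alternative can match and the input is returned unchanged.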
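The look-behind restriction can be checked with the standard library alone. The helper name compiles_with_stdlib_re is hypothetical, and the test patterns are simplified stand-ins for the one above.

```python
import re

def compiles_with_stdlib_re(pattern):
    # Hypothetical helper: True if the standard "re" module accepts the pattern
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles_with_stdlib_re(r'(?<=all)x'))    # True: fixed-width look-behind
print(compiles_with_stdlib_re(r'(?<=all.*)x'))  # False: variable-width look-behind
```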
