What is the correct Regex to find a letter but NOT if it appears in a bigger pattern/word/phrase?

Time:11-03

I am looking to use Regex to find all instances of a certain letter in a given string, but NOT if that letter appears in a larger word/phrase. For example:

For test string:

lag(a,1) 252*a max(3a*2) / 5*pctrange(a,10)

I want to obtain all instances of the letter 'a' excluding the letter 'a' that appears in the following three words:

lag max pctrange

i.e. I would like to use Regex to get all instances of the letter 'a' as highlighted here:

lag(a,1) 252*a max(3a*2) / 5*pctrange(a,10)

I attempted to use the following Regex but it keeps including the character after my desired letter 'a':

a[^"lag|max|pctrange"]

To provide some context, I'm in Python looking to replace these 'a' instances using the re module:

import re
string = "lag(a,1)   252*a   max(3a*2) / 5*pctrange(a,10)"
words = ["lag", "max", "pctrange"]
replace = "_"
re.sub(f"a[^\"{'|'.join(words)}\"]", replace, string)

This results in the (undesired) output:

lag(_1)   252*_  max(3_2) / 5*pctrange(_10)

I would like instead for the output to be the following:

lag(_,1)   252*_   max(3_*2) / 5*pctrange(_,10)

CodePudding user response:

I think what you are looking for are word boundaries.

The following regex matches a only if it sits between two word boundaries, or if it is directly preceded by a digit and followed by a word boundary:

(?<=\d)a\b|\ba\b

https://regex101.com/r/7IfinZ/1
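
Plugged into the Python snippet from the question, this could look like the following. It is a minimal sketch: the names `pattern` and `result` are illustrative and do not come from the original post.

```python
import re

string = "lag(a,1)   252*a   max(3a*2) / 5*pctrange(a,10)"
replace = "_"

# Match 'a' either directly after a digit (as in "3a") or standing alone
# between two word boundaries. The 'a' inside lag, max and pctrange has a
# letter on at least one side, so neither alternative matches it.
pattern = r"(?<=\d)a\b|\ba\b"

result = re.sub(pattern, replace, string)
print(result)  # lag(_,1)   252*_   max(3_*2) / 5*pctrange(_,10)
```

Note that \b treats digits and underscores as word characters, so an a directly followed by a digit (for example a2) would not be matched by this pattern.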

CodePudding user response:

To prevent a from being matched if adjacent to another letter try negative lookarounds.

(?i)(?<![a-z])a(?![a-z])

See this demo at regex101. The (?i) flag is used for caseless matching, so [a-z] behaves like [a-zA-Z].
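
In Python, the same idea might be written as below. This is a sketch that assumes `re.IGNORECASE` is an acceptable stand-in for the inline (?i) flag; the names `pattern` and `result` are illustrative only.

```python
import re

string = "lag(a,1)   252*a   max(3a*2) / 5*pctrange(a,10)"

# (?<![a-z]) requires that no letter comes immediately before the 'a',
# (?![a-z])  requires that no letter comes immediately after it.
# re.IGNORECASE plays the role of the inline (?i) flag.
pattern = re.compile(r"(?<![a-z])a(?![a-z])", re.IGNORECASE)

result = pattern.sub("_", string)
print(result)  # lag(_,1)   252*_   max(3_*2) / 5*pctrange(_,10)
```

Unlike the word-boundary version, this one also replaces an a that touches a digit on either side (a2 becomes _2), and because of the caseless flag it replaces a standalone A as well.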
