I have a text and I need to match all parts of the text except the given words with a regexp.
For example, if the text is 'Something went wrong and I could not do anything' and the given words are 'and' and 'or', then the result must be ['Something went wrong', 'I could', 'do anything'].
Please don't advise me to use string.split(), string.replace(), etc. I know several ways to do this with built-in methods. I'm wondering if there is a regex which can do this when I execute text.match(/regexp/g).
Please note that the regular expression must work at least in Chrome, Firefox and Safari, in versions no more than 3 behind the current one! At the time of asking this question the current versions are 100.0, 98.0.2 and 15.3 respectively. For example, you cannot use the lookbehind feature in Safari.
Please, before answering my question, go to https://regexr.com/ and check your answer! Your regular expression should highlight all parts of a sentence, including spaces, except for the given words.
Before asking this question I did my own search, but these links didn't help me. I also tried the non-accepted answers:
Match everything except for specified strings
Regex: match everything but a specific pattern
Regex to match all words except a given list
Regex to match all words except a given list (2)
Need to find a regular expression for any word except word1 or word2
Javascript match eveything except given words
CodePudding user response:
This is beyond the capabilities of regular expressions.
Regular expressions generally are restricted to patterns that can be produced by regular grammars (which is why they are called regular).
Some regular expression tools support features that go beyond this restriction, for example (negative) lookaheads or look-behinds, but these will not give you partial matches.
For the same reason, you cannot match opening and closing HTML tags using regular expressions.
CodePudding user response:
You can do this with the boolean | operator and capture groups.
/^(.*)( and | or )(((.*)( not )(.*))|(.*))$/i
Breaking that down:
Any characters from the start of the string up to " and " or " or ":
^(.*)( and | or )
All remaining characters:
(((.*)( not )(.*))|(.*))$
- Either two groups separated by " not ":
  ((.*)( not )(.*))
- Or, the remaining characters:
  (.*)
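To make the breakdown concrete, here is a small sketch that assembles the same pattern from its two halves and notes which capture groups each half contributes. The names head, tail and pattern are only for illustration and are not part of the original answer.

```javascript
// Capture groups are numbered by the order of their opening parentheses.
const head = '^(.*)( and | or )';          // groups 1 and 2
const tail = '(((.*)( not )(.*))|(.*))$';  // groups 3 to 8
const pattern = new RegExp(head + tail, 'i');

// The assembled pattern is the same as the literal shown above.
console.log(pattern.source === '^(.*)( and | or )(((.*)( not )(.*))|(.*))$'); // true
```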
When using String.match, the output array will get populated based on what groups were found.
- matches[0] will be the whole string
- matches[1] will be the intro text "Something went wrong"
- matches[2] will be either " and " or " or "
- matches[6] will be either undefined or " not "
- if matches[6] == " not ", matches[5] will be the text before and matches[7] will be the text after
- if matches[6] == undefined, matches[3] will be the remainder of the string
function test(input) {
  let matches = input.match(/^(.*)( and | or )(((.*)( not )(.*))|(.*))$/i);
  console.log(matches);
  return matches;
}
test('Something went wrong and I could not do anything');
test('Something went wrong and I could recover');
test('Something went wrong or whatever');
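The group indices described above can be turned into the array the question asks for. The following is only a sketch: extractParts is a hypothetical helper name, and it assumes the input contains a single " and " or " or " followed by at most one " not ", which is all this pattern is designed to handle.

```javascript
// Hypothetical helper (not part of the original answer): returns the text
// parts around the matched words, or null when the pattern does not match.
function extractParts(input) {
  const matches = input.match(/^(.*)( and | or )(((.*)( not )(.*))|(.*))$/i);
  if (matches === null) {
    return null; // neither " and " nor " or " occurs in the input
  }
  if (matches[6] !== undefined) {
    // " not " was found: intro, text before " not ", text after " not "
    return [matches[1], matches[5], matches[7]];
  }
  // No " not ": intro plus the remainder of the string
  return [matches[1], matches[3]];
}

console.log(extractParts('Something went wrong and I could not do anything'));
// [ 'Something went wrong', 'I could', 'do anything' ]
console.log(extractParts('Something went wrong and I could recover'));
// [ 'Something went wrong', 'I could recover' ]
```

Because the leading (.*) is greedy, an input with several occurrences of " and " or " or " would be split only at the last one, so this does not generalize to an arbitrary number of words.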