I'm looking for a way to select a part of a string with punctuation based on a string that doesn't have punctuation.
Example:
Oh, my goodness. This is it. Oh.
I want to select "Oh, my goodness." (note the trailing period). The string that I have to search with is:
oh my goodness
I've been looking all around for a solution to this, but I can't seem to find a good answer. Can anyone help me?
CodePudding user response:
Your question lacks some details, so here are some assumptions:
- your space-separated search term is a sequence of words to find in that order, e.g. the search term
foo bar
will not find the input
some bar foo text
- your search term should ignore non-word chars, for example
foo bar
will find
some foo, bar text
and
some foo: bar text
- you want to find the search term anywhere in the input
- a trailing dot, if any, should be included (i.e. the dot is optional)
The regex can be tweaked as needed if some of the assumptions are not correct.
Code with match and replace examples:
const input = 'Oh, my goodness. This is it. Oh.';
const searchTerm = 'oh my goodness';
const regex = new RegExp('\\b' + searchTerm.replace(/ /g, '\\W+') + '\\.?', 'i');
console.log({
match: input.match(regex),
replace: input.replace(regex, '<b>$&</b>')
});
Output:
{
"match": [
"Oh, my goodness."
],
"replace": "<b>Oh, my goodness.</b> This is it. Oh."
}
Explanation of regex construct:
'\\b' -- word boundary (replace with '^' if you want to search at the beginning of the input string)
searchTerm.replace(/ /g, '\\W+') -- allow one or more non-word chars between words, such as , and :
'\\.?' -- include optional dot
'i' -- regex flag to ignore case
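If the search term can contain characters that are special in a regex (such as . or ?), it probably needs escaping before being passed to RegExp. The following is a minimal sketch under that assumption; escapeRegExp and findPhrase are hypothetical helper names, not part of the answer above.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: find `searchTerm` in `input`, allowing non-word
// characters between the words and one optional trailing dot.
// Returns the matched text, or null when nothing matches.
function findPhrase(input, searchTerm) {
  const words = searchTerm.trim().split(/\s+/).map(escapeRegExp);
  const regex = new RegExp('\\b' + words.join('\\W+') + '\\.?', 'i');
  const match = input.match(regex);
  return match ? match[0] : null;
}

console.log(findPhrase('Oh, my goodness. This is it. Oh.', 'oh my goodness'));
console.log(findPhrase('some bar foo text', 'foo bar'));
```

Splitting on whitespace instead of replacing single spaces also tolerates repeated spaces in the search term.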
CodePudding user response:
You can replace each space in the search term with .* so that any characters are accepted between the words:
const text = 'Oh, my goodness. This is it. Oh.';
const search = 'oh my goodness';
const expression = new RegExp(`${search.replace(/ /g, '.*')}[^.]*\\.*`, 'i');
// exec() returns null when nothing matches, so fall back to an empty array
const [match] = expression.exec(text) ?? [];
console.log(match);
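One thing to be aware of with this approach: .* matches any characters, including periods, so the match can span several sentences. The sketch below uses a made-up input to show the difference from a pattern that only allows non-word characters between the words; the names loose and strict are illustrative only.

```javascript
const text = 'Oh no. My word. Goodness me.';
const search = 'oh my goodness';

// Pattern from the answer above: `.*` may cross sentence boundaries.
const loose = new RegExp(`${search.replace(/ /g, '.*')}[^.]*\\.*`, 'i');

// Assumed alternative: only non-word characters are allowed between words.
const strict = new RegExp(`${search.replace(/ /g, '\\W+')}\\.?`, 'i');

const looseMatch = loose.exec(text);
const strictMatch = strict.exec(text);

console.log(looseMatch ? looseMatch[0] : null);   // the whole input string
console.log(strictMatch ? strictMatch[0] : null); // null
```

Which behaviour is wanted depends on whether other words may appear between the search words.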
CodePudding user response:
/[^.]*\b(oh|my)\b.(?=goodness)[^.]*\./Ug
- [^.]* and [^.]*\. match the start and the end of a sentence
- \b(oh|my)\b. matches the word oh or my, followed by any single character
- (?=goodness) is a positive lookahead. We tell the regex: 'Search for the words oh and my right before the word goodness'
- also, we use the g (global) and U (ungreedy) regex flags. Note that U is a PCRE flag and is not available in JavaScript.
In short, the regex matches every sentence that contains the mentioned words, so it splits the given line into the matching sentences.
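JavaScript's RegExp does not support a U flag, so the pattern cannot be used there as written. A rough equivalent, assuming the intent is simply to make both [^.]* quantifiers lazy, could look like this (sentenceRegex and sentences are illustrative names):

```javascript
const text = 'Oh, my goodness. This is it. Oh.';

// No U flag in JavaScript: each quantifier is made lazy with `?` instead.
const sentenceRegex = /[^.]*?\b(oh|my)\b.(?=goodness)[^.]*?\./g;

// matchAll requires the g flag and yields one match object per sentence found.
const sentences = Array.from(text.matchAll(sentenceRegex), (m) => m[0]);

console.log(sentences);
```

Like the original, this pattern is case-sensitive and hard-codes the search words, so it would need to be built dynamically for other search terms.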