Regex for matching all words unless at start of sentence

Time:05-12

I'm trying to find all words that begin with an uppercase letter, unless they are at the start of a sentence.

For example, the text

It was in late July that that he found out. He had seen Tim

would return:

July, Tim

So far I've got

(?!<*[\\s])([A-Z][A-Za-z]+)

but "He" and "It" are included in the results.

CodePudding user response:

You can use a negative lookbehind such as

(?<![.?!]\s|^)[A-Z][A-Za-z]+

Note that this matches words of two or more ASCII letters. If one-letter words should be found too, replace the + at the end with a * quantifier.

If you plan to match whole words only, add word boundaries: \b(?<![.?!]\s|^)[A-Z][A-Za-z]*\b
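The answer does not name a regex flavor, so here is a minimal sketch of the three variants in Python, assuming the built-in re module. Python's re only accepts fixed-width lookbehinds, so the [.?!]\s|^ alternation is split into two lookbehinds; that split, the GUARD name, and the sample text are my own assumptions, not part of the answer.

```python
import re

# Python's re needs fixed-width lookbehinds, so the answer's (?<![.?!]\s|^)
# is written as two separate lookbehinds. This is an adaptation for Python.
GUARD = r"(?<![.?!]\s)(?<!^)"

# Made-up sample text, not taken from the question.
sample = "Then I saw camelCase in May"

# + quantifier: an uppercase letter followed by at least one more letter.
two_or_more = re.findall(GUARD + r"[A-Z][A-Za-z]+", sample)

# * quantifier: one-letter words such as "I" are found too.
one_or_more = re.findall(GUARD + r"[A-Z][A-Za-z]*", sample)

# Word boundaries: the "Case" inside "camelCase" is no longer matched.
whole_words = re.findall(r"\b" + GUARD + r"[A-Z][A-Za-z]*\b", sample)

print(two_or_more)  # ['Case', 'May']
print(one_or_more)  # ['I', 'Case', 'May']
print(whole_words)  # ['I', 'May']
```

Flavors that allow alternatives of different lengths inside a lookbehind can use the pattern as written, without the split.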

The (?<![.?!]\s|^) is a negative lookbehind. It matches a location that is neither immediately preceded by a . / ? / ! followed by a whitespace character, nor at the start of the string.
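To see what each half of the lookbehind contributes, the sketch below runs the question's sample sentence through Python's re module with one half removed at a time. Since re requires fixed-width lookbehinds, the two alternatives are written as separate lookbehinds; the variable names are hypothetical.

```python
import re

# Sample sentence exactly as given in the question.
text = "It was in late July that that he found out. He had seen Tim"

# The two alternatives of (?<![.?!]\s|^), split because Python's re
# does not accept lookbehind alternatives of different widths.
not_after_punct = r"(?<![.?!]\s)"  # not right after ". ", "? " or "! "
not_at_start = r"(?<!^)"           # not at the start of the string
word = r"[A-Z][A-Za-z]*\b"

both = re.findall(r"\b" + not_after_punct + not_at_start + word, text)
punct_only = re.findall(r"\b" + not_after_punct + word, text)
start_only = re.findall(r"\b" + not_at_start + word, text)

print(both)        # ['July', 'Tim']
print(punct_only)  # ['It', 'July', 'Tim']   "It" slips through
print(start_only)  # ['July', 'He', 'Tim']   "He" slips through
```

Both checks are needed: the punctuation check filters "He", and the start-of-string check filters "It".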
