stackers!
I have been trying to figure this out for some time, but with no luck.
(.*?(?:\.|\?|!))(?: |$)
The pattern above captures each sentence in a paragraph, splitting after the ending punctuation.
Example:
1. Today is the greatest. You are the greatest.
The match comes back with three results:
Match {
1.
Today is the greatest.
You are the greatest.
}
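For reference, here is a minimal sketch that reproduces this behavior. Python's `re` module is an assumption here, since the question does not name a regex engine, and the input string is the example above with its leading "1." list number.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL_RE = re.compile(r"(.*?(?:\.|\?|!))(?: |$)")

text = "1. Today is the greatest. You are the greatest."

# With a single capturing group, findall returns that group's text per match.
matches = ORIGINAL_RE.findall(text)
print(matches)
# ['1.', 'Today is the greatest.', 'You are the greatest.']
```

The "1." is split off because the pattern treats every period followed by a space as the end of a sentence.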
However, I am trying to get it not to break after a number followed by a period, and would like to see the following match instead:
Match {
1.Today is the greatest.
You are the greatest.
}
Thanks for your help in advance
CodePudding user response:
Use
.*?[.?!](?=(?<!\d\.)\s+|\s*$)
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
  .*?                      any character except \n (0 or more times
                           (matching the least amount possible))
--------------------------------------------------------------------------------
  [.?!]                    any character of: '.', '?', '!'
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
      \d                       digits (0-9)
--------------------------------------------------------------------------------
      \.                       '.'
--------------------------------------------------------------------------------
    )                        end of look-behind
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
--------------------------------------------------------------------------------
  )                        end of look-ahead
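
The pattern can be tried out with a short sketch like the one below. It assumes Python's `re` module (the thread does not name an engine) and assumes the quantifier after the first `\s` is `+`, as the "1 or more times" line of the explanation indicates. The helper name `split_sentences` is hypothetical.

```python
import re

# Pattern from the answer; the first \s is written with `+` (one or more),
# matching the explanation's "1 or more times".
SENTENCE_RE = re.compile(r".*?[.?!](?=(?<!\d\.)\s+|\s*$)")


def split_sentences(text):
    """Return the sentences in text, keeping a leading "1." with its sentence."""
    # The lookahead does not consume the separating whitespace, so every match
    # after the first starts with it; strip() removes that leading space.
    return [match.strip() for match in SENTENCE_RE.findall(text)]


print(split_sentences("1. Today is the greatest. You are the greatest."))
# ['1. Today is the greatest.', 'You are the greatest.']
```

One caveat: the lookbehind also suppresses a split after a sentence that really does end in a digit, so "I have 2. You have 3." comes back as a single match.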