For example, I have this sentence:
Hello. Hello.bye. 125. 000. 182.000. U. S. A.
I want to split it into a list of String, and I am using the regex "\.\s"
for it. The result looks like this:
Hello
Hello.bye
125
000
182.000
U
S
A
But I want to add a condition: if there is a digit or a capital letter before the \.\s,
it should not split there. For example, in the above case the result should look like this:
Hello
Hello.bye
125. 000
182.000
U. S. A.
I have tried using a conditional with a negative lookbehind, but it's not working. Here is what I am doing:
(?(?<!([A-Z]|[0-9]))(\.\s))
CodePudding user response:
You may use this regex for matching the parts you want to match:
\d+\.(?:\s*\d+)?|[A-Z]+(?:\.\s*[A-Z])+|\w[\w.]+?(?=\.\s)
RegEx Breakup:
\d+\.(?:\s*\d+)?: Match digits separated by a dot and 0 or more whitespaces
|: OR
[A-Z]+(?:\.\s*[A-Z])+: Match uppercase strings separated by a dot and 0 or more whitespaces
|: OR
\w[\w.]+?(?=\.\s): Match any other text that starts with a word char and must have a dot and whitespace as delimiter.
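To try the matching approach, here is a minimal sketch. It makes two assumptions that are not stated in the thread: the language is Python (its `re` module), and the quantifiers in the pattern are `+`, since the pattern only makes sense with them. The variable names are illustrative only.

```python
import re

# Assumed form of the matching pattern, with `+` quantifiers.
PATTERN = re.compile(
    r"\d+\.(?:\s*\d+)?"         # digits, a dot, then optional whitespace and digits
    r"|[A-Z]+(?:\.\s*[A-Z])+"   # uppercase letters joined by dots and optional whitespace
    r"|\w[\w.]+?(?=\.\s)"       # any other text, up to a dot followed by whitespace
)

text = "Hello. Hello.bye. 125. 000. 182.000. U. S. A."

# The pattern has no capturing groups, so findall returns the full matches.
parts = PATTERN.findall(text)
print(parts)
# ['Hello', 'Hello.bye', '125. 000', '182.000', 'U. S. A']
```

Note that with this form of the pattern the last item comes out as `U. S. A`, without the final dot, because the uppercase branch ends on a letter.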
For splitting, the following regex would work, but matching should still be preferred:
(?<![A-Z]|[1-9](?=\.\s\d))\.\s
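A quick way to check the split regex is the sketch below. It assumes Python's `re` module, which the thread does not name, and relies on both lookbehind alternatives being one character wide so that Python accepts the lookbehind. The variable names are illustrative only.

```python
import re

# Split on ". " unless the dot follows an uppercase letter,
# or follows a digit 1-9 and is itself followed by whitespace and a digit.
SPLIT = re.compile(r"(?<![A-Z]|[1-9](?=\.\s\d))\.\s")

text = "Hello. Hello.bye. 125. 000. 182.000. U. S. A."

parts = SPLIT.split(text)
print(parts)
# ['Hello', 'Hello.bye', '125. 000', '182.000', 'U. S. A.']
```

Because the digit class is `[1-9]`, a number whose last digit before the dot is `0` (for example `120. 000`) would still be split; widening the class to `\d` would change the result for this sample input, so that trade-off is left to the reader.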