How can I split a long string using regex (\.\s) ignoring a few cases

Time:09-08

For example, I have this sentence:

Hello. Hello.bye. 125. 000. 182.000. U. S. A. 

I want to split it into a list of String, and I am using the regex "\.\s" for it. The result looks like this:

Hello
Hello.bye
125
000
182.000
U
S
A
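
The unconditional split above can be reproduced with a short sketch. Python's `re` module is an assumption here, since the question does not name a language, and `text` and `naive_parts` are illustrative names. Python keeps an empty string after the final delimiter, so it is filtered out.

```python
import re

# Sample sentence from the question, including its trailing space.
text = "Hello. Hello.bye. 125. 000. 182.000. U. S. A. "

# Split on every dot that is followed by a whitespace character,
# dropping the empty string left after the last delimiter.
naive_parts = [part for part in re.split(r"\.\s", text) if part]

print(naive_parts)
```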

But I want to add a condition: if there is a digit or a capital letter before the \.\s, it should not split there. For the example above, the result should look like this:

Hello
Hello.bye
125. 000
182.000
U. S. A.

I have tried using a negative lookbehind inside a conditional, but it's not working. Here is what I tried:

(?(?<!([A-Z]|[0-9]))(\.\s))

CodePudding user response:

You may use this regex for matching the parts you want to match:

\d+\.(?:\s*\d+)?|[A-Z]+(?:\.\s*[A-Z])+|\w[\w.]+?(?=\.\s)


RegEx Breakup:

  • \d+\.(?:\s*\d+)?: Match digits followed by a dot, optionally followed by zero or more whitespace characters and more digits
  • |: OR
  • [A-Z]+(?:\.\s*[A-Z])+: Match uppercase letters separated by a dot and zero or more whitespace characters
  • |: OR
  • \w[\w.]+?(?=\.\s): Match any other text that starts with a word character and is followed by a dot and a whitespace character as the delimiter.
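
As a rough illustration of the matching approach, here is a minimal sketch. It assumes Python's `re` module (the thread does not name a language) and assumes the pattern is written with its `+` quantifiers (`\d+`, `[A-Z]+`, `[\w.]+?`). The names `text`, `MATCH_PATTERN` and `matched_parts` are illustrative only.

```python
import re

# Sample sentence from the question, including its trailing space.
text = "Hello. Hello.bye. 125. 000. 182.000. U. S. A. "

# Three alternatives, tried left to right at each position:
#   1. digits, a dot, then optionally whitespace and more digits
#   2. uppercase letters joined by a dot and optional whitespace
#   3. any other token, up to (not including) a dot plus whitespace
MATCH_PATTERN = re.compile(
    r"\d+\.(?:\s*\d+)?|[A-Z]+(?:\.\s*[A-Z])+|\w[\w.]+?(?=\.\s)"
)

# The pattern has no capturing groups, so findall returns whole matches.
matched_parts = MATCH_PATTERN.findall(text)

print(matched_parts)
```

With this input the last item comes back as `U. S. A` without its final dot, because the closing `. ` is not followed by another uppercase letter.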

For splitting, the following regex would work, though the matching approach above is still preferable:

(?<![A-Z]|[1-9](?=\.\s\d))\.\s
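
A minimal sketch of the splitting variant, again assuming Python's `re` module, which accepts this lookbehind because both alternatives are exactly one character wide. The names `text`, `SPLIT_PATTERN` and `split_parts` are illustrative. Surrounding whitespace is stripped from each piece because the final `. ` is not a split point and stays attached to the last item.

```python
import re

# Sample sentence from the question, including its trailing space.
text = "Hello. Hello.bye. 125. 000. 182.000. U. S. A. "

# Split on dot + whitespace unless the dot is preceded by
#   - an uppercase letter, or
#   - a digit 1-9 whose dot and whitespace are followed by another digit.
SPLIT_PATTERN = re.compile(r"(?<![A-Z]|[1-9](?=\.\s\d))\.\s")

# Strip each piece so the last one does not keep the trailing space.
split_parts = [part.strip() for part in SPLIT_PATTERN.split(text)]

print(split_parts)
```

Note that the `[1-9]` class is tuned to this sample: a number ending in `0` before `. ` and another digit would still be split.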

