Split camel case text with number groups

Time:12-04

I have strings containing camel case text and numbers and would like to split them.

E.g. the string "abcDefGhi345J6" should be split into

["abc", "Def", "Ghi", "345", "J", "6"]

My best effort is

"abcDefGhi345J6".split("(?=\\p{Lu})|(?!\\p{Lu})(?=\\d+)")

which gives me

["abc", "Def", "Ghi", "3", "4", "5", "J", "6"]

PS: The answers in the marked duplicate do NOT give the expected output, as they are not Unicode agnostic.

CodePudding user response:

You may use this regex for splitting:

(?=\p{Lu})|(?<!\d)(?=\d)


For Java code:

String[] arr = string.split("(?=\\p{Lu})|(?<!\\d)(?=\\d)");

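(?=\p{Lu}) matches the position before every uppercase letter. (?<!\d)(?=\d) matches a position that is followed by a digit but not preceded by one, i.e. only the start of each run of digits, so "345" stays together instead of being split digit by digit.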
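
To see the whole thing end to end, here is a minimal sketch. The class name CamelCaseSplitter and the split helper are hypothetical names introduced only for illustration; the regex is the one from the answer above. It assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class CamelCaseSplitter {

    // Split before every uppercase letter (\p{Lu} is Unicode-aware),
    // or at the start of a digit run (digit ahead, no digit behind).
    private static final Pattern BOUNDARY =
            Pattern.compile("(?=\\p{Lu})|(?<!\\d)(?=\\d)");

    static String[] split(String input) {
        return BOUNDARY.split(input);
    }

    public static void main(String[] args) {
        // prints [abc, Def, Ghi, 345, J, 6]
        System.out.println(Arrays.toString(split("abcDefGhi345J6")));
        // Non-ASCII uppercase letters are handled as well:
        // prints [äbc, Öde, 12]
        System.out.println(Arrays.toString(split("äbcÖde12")));
    }
}
```

Note that \d in Java matches only the ASCII digits 0-9 by default. If digits from other scripts also need to be grouped, replacing \d with \p{Nd} in both places is one option.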
