Java regex for preserving currency symbols along with comma and dot if they are surrounded by number

Time:09-27

This is my input string:

String inputString = "fff.fre def $fff$ £45112,662 $0.33445533 abc,def 12,34";

I tried the regex below to split it:

String[] tokens = inputString.split("(?![$£](?=(\\d)*[.,]?(\\d)*))[\\p{Punct}\\s]");

but it does not preserve a comma or dot that is surrounded by digits. Basically, I don't want to split on a comma or dot when it is part of a price value.

The output I get is:

token==>fff
token==>fre
token==>def
token==>$fff$
token==>£45112
token==>662
token==>$0
token==>33445533
token==>abc
token==>def
token==>12
token==>34

Expected output:

token==>fff
token==>fre
token==>def
token==>$fff$
token==>£45112,662
token==>$0.33445533
token==>abc
token==>def
token==>12
token==>34

Answer:

Instead of split, you may use this simpler regex to get all the desired matches:

[$£]\w+[$£]?|[^\p{Punct}\h]+


RegEx Breakup:

  • [$£]: Match $ or £
  • \w+: Match 1+ word characters
  • [$£]?: Match an optional $ or £
  • |: OR
  • [^\p{Punct}\h]+: Match 1+ characters that are neither punctuation nor horizontal whitespace

Code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

final String regex = "[$£]\\w+[$£]?|[^\\p{Punct}\\h]+";
final String string = "fff.fre def $fff$ £45112,662 $0.33445533 abc,def 12,34";

final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);

// Print every match instead of splitting the input
while (matcher.find()) {
   System.out.println("token==>" + matcher.group());
}
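
Note that \w does not match "." or ",", so with the pattern above a currency token ends at the first separator ("£45112" and "662" come out as two tokens). If the decimal or thousands part should stay attached to the price, one possible variant is to put a digits-only alternative first that allows ".digits" or ",digits" groups after the currency symbol. The following is only a sketch under that assumption: the class name PriceTokenizer and the method tokenize are made up for illustration, and the pound sign is written as \u00A3 purely to avoid source-encoding problems.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class PriceTokenizer {

    // \u00A3 is the pound sign, escaped so the source compiles under any file encoding.
    // Alternative 1: currency symbol, digits, then any number of ".digits" or ",digits" groups.
    // Alternative 2: currency symbol followed by word characters, e.g. $fff$.
    // Alternative 3: a run of characters that are neither punctuation nor horizontal whitespace.
    private static final Pattern TOKEN = Pattern.compile(
            "[$\u00A3]\\d+(?:[.,]\\d+)*|[$\u00A3]\\w+[$\u00A3]?|[^\\p{Punct}\\h]+");

    // Collects every match of TOKEN, in order of appearance.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "fff.fre def $fff$ \u00A345112,662 $0.33445533 abc,def 12,34";
        for (String token : tokenize(input)) {
            System.out.println("token==>" + token);
        }
    }
}
```

Because the separator group requires digits after the "." or ",", a price at the end of a sentence such as "$5." still yields "$5" without the trailing dot, and "12,34" is still split in two since it has no currency symbol.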