I have a string and I want to remove every non-alphabetic character (digits, punctuation, and symbols such as 0..9!@#$%^&*()_.,) and keep only the letters.
After some searching and testing, I ended up with two regexes:
String str = "123hello!#$% مرحبا. ok";
str = str.replaceAll("[^a-zA-Z]", "");
str = str.replaceAll("\\P{InArabic}", "");
System.out.println(str);
I want this to print "hello مرحبا ok".
But of course it prints an empty string: the first regex removes every non-Latin character (including the Arabic text and the spaces), and the second then removes everything that is not Arabic.
My question is: how can I merge these two regexes into one that keeps only Arabic and English letters?
CodePudding user response:
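Use a lowercase p (\p{InArabic}) inside the character class: the leading ^ already negates the whole class, so the negated form \P is not needed. No quantifier is needed either (though adding one would not hurt), because replaceAll replaces every match: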
String str = "123hello!#$% مرحبا. ok";
str = str.replaceAll("[^a-zA-Z \\p{InArabic}]", "");
System.out.println(str);
Prints:
hello مرحبا ok
Note: based on your expected result, you want the spaces kept, so a space is included in the character class.
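If you do this in more than one place, the same regex can be wrapped in a small helper. The following is only a sketch: the class name LatinArabicFilter and the method name keepLatinAndArabic are made up for illustration, and the pattern is precompiled merely to avoid recompiling it on every call.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
final class LatinArabicFilter {
    // Matches any single character that is NOT a-z, A-Z, a space,
    // or a code point in the Unicode "Arabic" block.
    private static final Pattern UNWANTED =
            Pattern.compile("[^a-zA-Z \\p{InArabic}]");

    // Returns the input with every unwanted character removed.
    static String keepLatinAndArabic(String input) {
        return UNWANTED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Same sample string as in the question.
        System.out.println(keepLatinAndArabic("123hello!#$% مرحبا. ok"));
    }
}
```

One caveat: \p{InArabic} matches the whole Arabic Unicode block, which also contains Arabic-Indic digits and Arabic punctuation, so those would be kept too. If that matters for your input, you may need a stricter class.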