Java remove duplicate characters(included) from a given String

Time:06-26

Is there an efficient way (one that does not use the "contains" or "indexOf" methods) to remove every character that occurs more than once in a given String, including its first occurrence?

For example:

input: "abcdcb" output: "ad"

input: "abracadabra" output: "cd"

I tried using a regex, but something went wrong:

public static String function(String str) {
    return str.replaceAll("([a-z]+)\\1", "");
}

Edit: A longer, inefficient solution:

public static String function(String s) {
    String result = "";

    for (int i = 0; i < s.length(); i++)
        if(s.length() - s.replace("" + s.charAt(i), "").length() == 1)
            result += "" + s.charAt(i);
    
    return result;
}

CodePudding user response:

A single regular expression cannot do this.

As Pshemo suggested, create a Map that counts the character occurrences:

Map<Integer, Long> codePointCounts = s.codePoints().boxed().collect(
    Collectors.groupingBy(
        c -> c, LinkedHashMap::new, Collectors.counting()));

(Using LinkedHashMap will preserve the order of the code points when they are placed in the Map.)

Then discard all the Map entries which don’t have a count of one:

codePointCounts.values().retainAll(Set.of(1L));

The Map's values() collection is a live view, so removing a value removes its whole entry. The remaining Map keys are therefore exactly the code points that occur once in the original String. You can create a String from an array of ints:

int[] codePoints = codePointCounts.keySet().stream()
    .mapToInt(Integer::intValue).toArray();

String uniqueChars = new String(codePoints, 0, codePoints.length);
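
For reference, the three steps above can be put together into one runnable sketch. The class name UniqueChars and the method name removeRepeated are made up for this illustration, and Set.of assumes Java 9 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class and method names, used only for this illustration.
class UniqueChars {
    // Returns the code points of s that occur exactly once, in original order.
    static String removeRepeated(String s) {
        // Count occurrences per code point; LinkedHashMap keeps first-seen order.
        Map<Integer, Long> codePointCounts = s.codePoints().boxed().collect(
            Collectors.groupingBy(
                c -> c, LinkedHashMap::new, Collectors.counting()));

        // values() is a live view: dropping a value drops its whole entry.
        codePointCounts.values().retainAll(Set.of(1L));

        // The surviving keys are the code points that appeared once.
        int[] codePoints = codePointCounts.keySet().stream()
            .mapToInt(Integer::intValue).toArray();
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        System.out.println(removeRepeated("abcdcb"));      // prints "ad"
        System.out.println(removeRepeated("abracadabra")); // prints "cd"
    }
}
```

This makes a single pass to count and a single pass over the distinct code points, so it avoids the repeated full-string replace calls of the loop in the question.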