Show duplicate entries removed by a set

Time:02-14

I expect the program below to print the duplicate entries that were removed by the Set, but it always prints an empty list. What could be wrong?


import java.util.*;

public class HelloWorld{
    public static void main(String[] args){
        String str="HelloWorld";
        String[] strChar=str.split("");
        
        //Remove the duplicates
        Set<String> strSet=new LinkedHashSet<String>(Arrays.asList(strChar));
        
        List<String> strList=new ArrayList<String>(Arrays.asList(strChar));
        System.out.println(strList);   
        System.out.println(strSet);  

        //Show the duplicate entries that were removed by the set
        strList.removeAll(strSet);
 
        System.out.println(strList);               
    }
}

CodePudding user response:

strList.removeAll(strSet);

This is the line that causes the problem. removeAll removes from the list every element that is contained in the collection passed as its argument (strSet), and it removes all occurrences of each match, not just the first. Since every element of strList is also in strSet, the list ends up empty.

The following works instead, because the remove(Object) method on a list removes only the first occurrence of the element, not all matching elements.

import java.util.*;

public class Main
{
    public static void main(String[] args) {
        
        
        String str="HelloWorld";
        String[] strChar=str.split("");
        List<String> strList=new ArrayList<String>(Arrays.asList(strChar));
        
        //Remove the duplicates
        Set<String> strSet=new LinkedHashSet<String>(Arrays.asList(strChar));
        
        strSet.forEach(element -> {
            if (strList.contains(element)){
                strList.remove(element);
             }
        });
        
        
        System.out.println(strList);
    }
}
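
The difference between the two calls can be checked in isolation. The sketch below is only an illustration: the class name RemoveDemo and its two helper methods are made up for this example and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveDemo {
    // removeAll drops every occurrence of every element found in toRemove.
    static List<String> afterRemoveAll(List<String> source, List<String> toRemove) {
        List<String> copy = new ArrayList<>(source);
        copy.removeAll(toRemove);
        return copy;
    }

    // remove(Object) drops only the first occurrence, once per element of toRemove.
    static List<String> afterRemoveEachOnce(List<String> source, List<String> toRemove) {
        List<String> copy = new ArrayList<>(source);
        for (String s : toRemove) {
            copy.remove(s);
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("l", "l", "o", "l");
        List<String> unique = Arrays.asList("l", "o");
        System.out.println(afterRemoveAll(letters, unique));      // []
        System.out.println(afterRemoveEachOnce(letters, unique)); // [l, l]
    }
}
```

With removeAll nothing survives, while removing each unique element once leaves exactly the extra copies.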


CodePudding user response:

strSet contains [H, e, l, o, W, r, d], which covers every distinct item in strList (one of each), so calling strList.removeAll(strSet) removes every item from the list. You can use the following code to find the duplicate characters instead.

String str="HelloWorld";
String[] split = str.split("");
Map<String, Integer> charRepeat = new HashMap<>();
for (String c : split) {
  if (charRepeat.containsKey(c)) {
    charRepeat.put(c, charRepeat.get(c) + 1);
  } else {
    charRepeat.put(c, 1);
  }
}
List<String> duplicationChars = new ArrayList<>();
for (Map.Entry<String, Integer> entry : charRepeat.entrySet()) {
  if (entry.getValue() > 1) {
    duplicationChars.add(entry.getKey());
  }
}
System.out.println(duplicationChars);
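
As a possible shortening of the counting loop above, Map.merge can replace the containsKey/put branches. This is a sketch, not the answer's own code: the class name MergeCount and the method duplicatedChars are hypothetical, and a LinkedHashMap is swapped in (an assumption) so that the result follows the order of first appearance.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeCount {
    // Collects each character that occurs more than once, in order of first appearance.
    static List<String> duplicatedChars(String str) {
        Map<String, Integer> charRepeat = new LinkedHashMap<>();
        for (String c : str.split("")) {
            // merge stores 1 for a new key, otherwise adds 1 to the existing count
            charRepeat.merge(c, 1, Integer::sum);
        }
        List<String> duplicationChars = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : charRepeat.entrySet()) {
            if (entry.getValue() > 1) {
                duplicationChars.add(entry.getKey());
            }
        }
        return duplicationChars;
    }

    public static void main(String[] args) {
        System.out.println(duplicatedChars("HelloWorld")); // [l, o]
    }
}
```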

CodePudding user response:

Output of the provided code:

$ javac HelloWorld.java && java HelloWorld
[H, e, l, l, o, W, o, r, l, d]
[H, e, l, o, W, r, d]
[]

The line:

System.out.println(strList); 

prints an empty list because all of its entries were removed by the statement below:

strList.removeAll(strSet); // strSet is [H, e, l, o, W, r, d], so removeAll leaves strList empty

The output is correct.

Solution

The program below displays the duplicate values that were removed. It uses a map to record how many times each character occurs; a character that occurs more than once was dropped by the set at least once.

import java.util.*;

public class HelloWorld{
    public static void main(String[] args){
        String str="HelloWorld";
        String[] strChar=str.split("");
        
        //Remove the duplicates
        Set<String> strSet=new LinkedHashSet<String>(Arrays.asList(strChar));
        
        List<String> strList=new ArrayList<String>(Arrays.asList(strChar));
        System.out.println(strList);   
        System.out.println(strSet);  

        //Show the duplicate entries that were removed by the set
        strList.removeAll(strSet);

        //Use a map to remember the number of occurrences of each character
        Map<Character,Integer> map = new LinkedHashMap<>();
        for(char ch : str.toCharArray())
            if(map.containsKey(ch))
                map.put(ch, map.get(ch) + 1);
            else map.put(ch, 1);
        var out = System.out;
        //The duplicate entries which were removed are printed below
        for(var entry : map.entrySet())
            if(entry.getValue() != 1)
                out.print(entry.getKey() + " ");
        out.println();
 
        //System.out.println(strList);               
    }
}

Output:

$ javac HelloWorld.java && java HelloWorld
[H, e, l, l, o, W, o, r, l, d]
[H, e, l, o, W, r, d]
l o 

l and o were removed because they occur multiple times.

CodePudding user response:

As others mentioned, the error comes from removeAll, since it removes all occurrences.

The add method of a Set returns true if the element was added and false otherwise; for a set, false means the element already exists in it. I think using this method is more intuitive in your case. I'll also use Character for the elements of the collections, because a string of length 1 is basically equivalent to a char.

import java.util.*;

public class Duplicates {

    public static void main(String[] args) {
        String str = "HelloWorld";
        //Hold unique elements
        Set<Character> uniqueLetters = new LinkedHashSet<>();
        //Hold duplicates
        List<Character> duplicates = new ArrayList<>();
        for (int charIndex = 0; charIndex < str.length(); charIndex++) {
            char letter = str.charAt(charIndex);
            boolean isAdded = uniqueLetters.add(letter);
            if (!isAdded) {
                //if isAdded is false, letter already exists in set
                duplicates.add(letter);
            }
        }
        System.out.println("unique letters - " + uniqueLetters);
        System.out.println("duplicated letters - " + duplicates);
    }
}

This will print:

unique letters - [H, e, l, o, W, r, d]
duplicated letters - [l, o, l]

l appears twice in the duplicates list because it occurs 3 times in the string (and consequently was rejected as a duplicate twice). If you need it to appear only once in duplicates, just change the List to a Set.
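
A minimal sketch of that variation, assuming a LinkedHashSet is acceptable for the duplicates so that insertion order is kept; the class name UniqueDuplicates and the method duplicatedLetters are hypothetical names chosen for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class UniqueDuplicates {
    // Returns each letter that occurs more than once, listed a single time.
    static Set<Character> duplicatedLetters(String str) {
        Set<Character> uniqueLetters = new LinkedHashSet<>();
        Set<Character> duplicates = new LinkedHashSet<>();
        for (char letter : str.toCharArray()) {
            // add returns false when the letter is already in uniqueLetters
            if (!uniqueLetters.add(letter)) {
                duplicates.add(letter);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(duplicatedLetters("HelloWorld")); // [l, o]
    }
}
```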

CodePudding user response:

Because sets are more efficient for removal, I would do it like this.

String str="HelloWorld";
String[] strChar=str.toLowerCase().split("");
Set<String> dups = new HashSet<>();
Set<String> nonDups = new HashSet<>();
for (String v : strChar) {
    nonDups.add(v);
    if(!dups.add(v)) {
        nonDups.remove(v);
    }
}
dups.removeAll(nonDups);
System.out.println(dups);

prints

[l, o]

If you would like to get a count of the duplicates, you can do it like this.

  • create a Map<Character, Integer> to count the occurrences of each character
  • iterate over the characters; if the existing value is null, set it to 1, otherwise add 1 to it
  • then simply iterate over the entrySet and print the entries whose count is > 1

String str="HelloWorld";
Map<Character, Integer> counts = new HashMap<>();
for (char ch : str.toCharArray()) {
    counts.compute(ch, (k,v)->v ==null ? 1 : v + 1);
}
counts.entrySet().forEach(e-> {
    if (e.getValue() > 1) {
        System.out.println(e);
    }
});

prints

l=3
o=2