Home > Software engineering >  How can I remove duplicates from an unsorted array?
How can I remove duplicates from an unsorted array?

Time:05-12

I am working on a Java Program to generate a random ArrayList populated with integers from 1-25 inclusive. I then needed to make a copy of the array and store it in a separate location. While I was able to accomplish these portions, I am having trouble implementing a way to:

  1. Properly remove all duplicates in the array

  2. Create 2 seperate arrays. One with duplicates only and one with duplicate numbers removed.

For example:

Original array: 3 9 1 3 5 5 7 3

Array with no duplicates: 3 9 1 5 7 3

Array containing the numbers removed from original array: 3 5

Here is my program so far:

        ArrayList <Integer> OgArr = new ArrayList <Integer>();
    
    Random random = new Random();
    
    int n = 35;
    
    for (int i = 0; i < n; i  )
    {
        Integer r = 1   random.nextInt(25);
        OgArr.add(r);
    }
    
    ArrayList<Integer> copyOfArr = (ArrayList<Integer>) OgArr.clone();
    
    int i = 0;
    int leng = copyOfArr.size();
    
    for (i = 0; i < leng - 1; i  ) {
        int imin = i;
        for (int j = i   1; j < leng; j  ) {
            if (copyOfArr.get(j) < copyOfArr.get(imin)) {
                imin = j;
            }
        }
        Collections.swap(copyOfArr,i,imin);      
    }
    //System.out.println(copyOfArr);
    int z = 0;
    
    int[] ogArray = new int[copyOfArr.size()];
    
    for (z = 0; z < copyOfArr.size(); z  ){
        ogArray[z] = copyOfArr.get(z);
    }
    
    int a = 0;
    int b = 0;
    
    for(a = 0; a < ogArray.length; a  ){ // PRIMARILY TROUBLED AREA
        int counter = 0;
        int temp = ogArray[a];
        for(b = 1; b < ogArray.length; b  ){
            if(temp == ogArray[b]){
                count  
            }
        }

This program above creates an arraylist, populates it with 1-25 being the params, and makes a copy and converts it to an array.

My issue comes down to comparing different indicies of the array to check for duplicates. Following that, also storing them in two seperate arrays.

Any assistance would be greatly appreciated.

CodePudding user response:

You have 2 strategies:

  1. Loop i: Loop from 0 to the end of the array.
  2. Remember the value of the i-th element, then go through all elements after it, and remove anything that is equal to i. That would be a loop-in-a-loop. Every time you remove a value, add it to the removed list.

Alternatively:

  1. Create a new blank Set<Integer>, which will track which values we've seen before.
  2. Now loop again. For each value, add the value to the set. If the set reports that nothing changed because that number was already in the set, remove from the array and add to the 'removed' array.

The downside of the second option is that it involves Set which may not be in the spirit of the exercise. But it is fundamentally more efficient (the first takes O(n^2), that is: The performance degrades at a ratio of the square of the input's size, whereas the second is O(n), that is: That one is linear).

"Remove" is also quite tricky, if you want to do that efficiently, you end up holding 2 'pointers' into the list; a write head and a read head. For every number you process, you do nothing if it needs to be deleted, or you write it and advance the 'write head'. That model is 'efficient', whereas first writing a simple 'remove a value' method and then using that is not (as 'remove a value from a int[]' takes some time; it involves moving all elements that come after the value to be removed down by one, that's costly.

Computers are fast. Unless you have array with 100,000 size, you're unlikely to notice.

This sounds like homework and often the point is to create 'the most efficient algorithm', for some definition of 'most efficient'.

CodePudding user response:

Essentially you are looking for a list of 25 random numbers, then that list with duplicates removed and then the list of duplicates.

Java has pretty good methods for all of these operations. Unless you have a reason to do it with raw arrays I'd suggest using those.

For example:

List<Integer> randomList = random.ints(35, 0, 25).toList();
List<Integer> uniqueList = randomList.stream().distinct().toList();
List<Integer> duplicates = randomList.stream()
    .filter(v -> Collections.frequency(randomList, v) > 1)
    .distinct().toList();
  • Related