Remove duplicates from array C-CodePudding

Input: int arr[] = {10, 20, 20, 30, 40, 40, 40, 50, 50}

Output: 10, 30

My code:

int removeDup(int arr[], int n)
{
    int temp;
    bool dupFound = false;

    for(int i=0;i<n;i  ){
        for(int j=i 1;j<n;j  ){
            if(arr[i] == arr[j]){
                if(!dupFound){
                    temp = arr[i];
                    dupFound = true;
                }
                else{
                    arr[i] = temp;
                }
            }
        }
    }
    //shift here
}

First of all, I don't know if this is the most efficient way of doing this.

I'm trying to find the first duplicate element, assign it to every duplicate element and shift them to the end of the array, which doesn't work because the last duplicate element cannot be compared.

I need some help with finding the last duplicate element, so I can assign temp to it.

CodePudding user response：

I do not understand the logic of your code. When you find the second element arr[j] that equals arr[i] you will assign temp to arr[i]. However, temp has been assigned arr[i] when you found the first duplicate. Essentially you do arr[i] = arr[i]. Its not clear how this is supposed to find unique elements.

You can use a map to count frequency of elements, then print those with frequency 1:

#include <unordered_map>
#include <iostream>

int main()
{
    std::unordered_map<int,size_t> freq;
    int arr[] = {10, 20, 20, 30, 40, 40, 40, 50, 50};
   
    // count frequencies 
    for (auto e : arr) {   freq[e]; }

    // print the elements e where freq[e] == 1
    for (const auto& f : freq) {
        if (f.second == 1) {
            std::cout << f.first << "\n";
        }
    }
}

Only small modifications needed to add the unique elements to a vector.

CodePudding user response：

Instead of trying to do everything at once, let us focus on correctness first:

int removeDup(int* arr, int n) {
  // Note: No i  ! This depends on whether we find a duplicate.
  for (int i = 0; i < n;) {
    int v = arr[i];
    bool dupFound = false;

    for (int j = i 1; j < n; j  ) {
      if (v == arr[j]) {
        dupFound = true;
        break;
      }
    }

    if (!dupFound) {
      i  ;
      continue;
    }

    // Copy values to the sub-array starting at position i,
    // skipping all values equal to v. 
    int write = i, skipped = 0;
    for (int j = i; j < n; j  ) {
      if (arr[j] != v) {
        arr[write] = arr[j];
        write  ;
      } else {
        skipped  ;
      }
    }

    // The previous loop duplicated some non-v elements.
    // We decrease n to make sure these duplicates are not
    // considered in the output
    n -= skipped;
  }

  return n;
}

CodePudding user response：

Let's start with logistics (so to speak). An array always contains a fixed number of items. There's simply no way to start with an array of 5 items, and turn it into an array of 2 items. Simply can't be done.

So, as a starting point, you need to either return something like an std::vector that keeps track of its size along with the data it contains, or else you're going to need to track the size, and return something to indicate how many elements in the array are valid after the processing.

Probably the simplest way to do things would be to use something like an std::unoredered_map to count the items, then walk through the map, and insert an item in the output if (and only if) its count is 1.

std::unordered_map<int, std::size_t> counts;
for (int i=0;i<n; i  )
      counts[arr[i]];

std::vector<int> output;
for (auto item : counts)
    if (item.second == 1)
        output.push_back(item.first);
return output;

If you want to modify the data in place, I'd start by sorting the input data. Then you'll start with two indices: one for your "input" position, and one for your "output" position. output starts as zero, and input as 1.

The general idea from there is pretty simple: we look at data[input] and see if it's different from both the preceding and succeeding elements. If so, we copy it to data[output], and increment the output position.

Since this tries to look at both the preceding and succeeding elements, we have to include special cases for the beginning and end of the array. The first element is unique if it's different from the following, and the end is unique if it's different from the preceding. Code can look like this:

#include <algorithm>
#include <iostream>

unsigned remove_dupe(int *data, unsigned n) { 

    if (n < 2) {
        return n;
    }

    std::sort(data, data n);

    unsigned output = data[0] != data[1];

    for (unsigned input = output 1; input<n-1; input  )
        if (data[input] != data[input-1] && data[input] != data[input 1])
            data[output  ] = data[input];
    if (data[n-1] != data[n-2]) {
        data[output  ] = data[n-1];
    }

    return output;
}

template <class T, std::size_t N>
void test(T (&arr)[N]) {
    unsigned end = remove_dupe(arr, N);

    for (int i=0; i<end; i  )
        std::cout << arr[i] << "\t";
    std::cout << "\n";
}

int main() { 
    int arr0[] = {10, 20, 20, 30, 40, 40, 40, 50, 50};
    int arr1[] = { 1, 2};
    int arr2[] = { 1, 1};

    test(arr0);
    test(arr1);
    test(arr2);
}

Result:

10  30  
1   2

CodePudding user response：

Another option that might be available is to sort() the array. When this is done, all duplicate values throughout the array are now adjacent. You simply compare element [n] with element [n 1] to see if they are the same. You can now find and count all duplicates in a single linear pass through the sorted array.

Sorting is one of the most heavily-studied class of algorithms in computer science, and very efficient processes can be developed which rely upon things being sorted a certain way.