Home > Software engineering >  Finding Duplicates in an array using a Set in C
Finding Duplicates in an array using a Set in C

Time:10-21

I am currently practicing for coding interviews and am working on a function that takes in an array and the size of that array and prints out which numbers in it are duplicates. I have gotten this to work using the two for loop method but want an optimized solution using sets. Snippet of the code I have is below,

#include <iostream>
#include <set>
using namespace std;

void FindDuplicate(int integers[], int n){
  set<int>setInt;

  for(int i = 0; i < n; i  ){

    //if this num is not in the set then it is not a duplicate
    
    if(setInt.find(integers[i]) != setInt.end()){
      setInt.insert({integers[i]});
    }
    else
      cout << integers[i] << " is a duplicate";
  }
}

int main() {
  
int integers [] = {1,2,2,3,3};

int n = sizeof(integers)/sizeof(integers[0]);

FindDuplicate(integers, n);
}

Any helpful advice is appreciated, thanks

CodePudding user response:

I think your comparison is not needed, insert do it for you: https://en.cppreference.com/w/cpp/container/set/insert

Returns a pair consisting of an iterator to the inserted element (or to the element that prevented the insertion) and a bool value set to true if the insertion took place.

Just insert element and check what insert function returns (false on second element of pair in case of duplication) :)

CodePudding user response:

here's a quick solution use an array of size N (try a big number) and whenever a number is added into the other array on the large array add 1 to the position like: array_of_repeated[user_input] ; so if the program asks how many times (for example) number 234 was repeated? std::cout<<array_of_repeated[requested_number]<<std::endl; so in this way you wont spend time looking for a number inside the other list

CodePudding user response:

my solution proposal is :

  • count the frequencies of each element (algo for frequencies are explained here frequency
  • display elements with frequency more than 1 (it is a duplicate)

In each operation, you do not use imbricated loops.

#include <iostream>
#include <unordered_map>

using namespace std;

void FindDuplicate(int integers[], int n)
{
    unordered_map<int, int> mp;

    // Traverse through array elements and
    // count frequencies
    for (int i = 0; i < n; i  )
    {
        mp[integers[i]]  ;
    }

    cout << "The repeating elements are : " << endl;
    for (int i = 0; i < n; i  ) {
      if (mp[integers[i]] > 1)
      {
          cout << integers[i] << endl;
          mp[integers[i]] = -1;
      }
    }
}

int main()
{

    int integers [] = {1,1,0,0,2,2,3,3,3,6,7,7,8};

    int n = sizeof(integers)/sizeof(integers[0]);

    FindDuplicate(integers, n);
}

CodePudding user response:

This is my feedback:

#include <iostream>
#include <vector>
#include <set>

// dont' do this, in big projects it's not done (nameclash issues)
// using namespace std;

// pass vector by const reference you're not supposed to change the input
// the reference will prevent data from being copied.
// naming is important, do you want to find one duplicate or more...
// renamed to FindDuplicates because you want them all

void FindDuplicates(const std::vector<int>& input) 
{
    std::set<int> my_set;     
    // don't use index based for loops if you don't have to
    // range based for loops are more safe
    // const auto is more refactorable then const int
    for (const auto value : input)
    {
        //if (!my_set.contains(value)) C   20 syntax 
        if (my_set.find(value) == my_set.end())
        {
            my_set.insert(value);
        }
        else
        {
            std::cout << "Array has a duplicate value : " << value << "\n";
        }
    }
}

int main() 
{
    // int integers[] = { 1,2,2,3,3 }; avoid "C" style arrays they're a **** to pass around safely
    // int n = sizeof(integers) / sizeof(integers[0]); std::vector (or std::array) have size() methods 

    std::vector input{ 1,2,2,3,3 };
    FindDuplicates(input);
}

CodePudding user response:

You do not need to use a set. To find the duplicates:

  1. Sort array with numbers
  2. Iterate over the array (start with second element) and copy elements where previous element equals
    current element into a new vector "duplicates"
  3. (Optional) use unique on the "duplicates" if you like to know which number is a duplicate and do not care if it is 2, 3 or 4 times in the numbers array

Example Implementation:

#include <algorithm>
#include <iostream>
#include <vector>

void
printVector (std::vector<int> const &numbers)
{
  for (auto const &number : numbers)
    {
      std::cout << number << ' ';
    }
  std::cout << std::endl;
}

int
main ()
{
  auto numbers = std::vector<int>{ 1, 2, 2, 42, 42, 42, 3, 3, 42, 42, 1, 2, 3, 4, 5, 6, 7, 7 };
  std::sort (numbers.begin (), numbers.end ());
  auto duplicates = std::vector<int>{};
  std::for_each (numbers.begin ()   1, numbers.end (), [prevElement = numbers.begin (), &duplicates] (int currentElement) mutable {
    if (currentElement == *prevElement)
      {
        duplicates.push_back (currentElement);
      }
    prevElement  ;
  });
  duplicates.erase (std::unique (duplicates.begin (), duplicates.end ()), duplicates.end ());
  printVector (duplicates);
}

edit: If you have no problem with using more memory and more calculations but like it more expressive:

  1. Sort numbers
  2. Create a new array with unique numbers "uniqueNumbers"
  3. Use "set_difference" to calculate (numbers-uniqueNumbers) which leads to an new array with all the duplicates
  4. (Optional) use unique on the "duplicates" if you like to know which number is a duplicate and do not care if it is 2, 3 or 4 times in the numbers array Example Implementation:
#include <algorithm>
#include <iostream>
#include <vector>

void
printVector (std::vector<int> const &numbers)
{
  for (auto const &number : numbers)
    {
      std::cout << number << ' ';
    }
  std::cout << std::endl;
}

int
main ()
{
  auto numbers = std::vector<int>{ 2, 2, 42, 42, 42, 3, 3, 42, 42, 1, 2, 3, 4, 5, 6, 7, 7 };
  std::sort (numbers.begin (), numbers.end ());
  auto uniqueNumbers = std::vector<int>{};
  std::unique_copy (numbers.begin (), numbers.end (), std::back_inserter (uniqueNumbers));
  auto duplicates = std::vector<int>{};
  std::set_difference (numbers.begin (), numbers.end (), uniqueNumbers.begin (), uniqueNumbers.end (), std::back_inserter (duplicates));
  std::cout << "duplicate elements: ";
  printVector (duplicates);
  std::cout << "unique duplicate elements: ";
  printVector ({ duplicates.begin (), std::unique (duplicates.begin (), duplicates.end ()) });
}
  •  Tags:  
  • c
  • Related