can someone suggest a better algorithm than this to check if there is at least one duplicate value i

Time:01-19

An unsorted integer array nums and its size numsSize are passed to the function containsDuplicate, which must return true if the array contains at least one duplicate value and false otherwise. My approach compares every element, up to the second-to-last, with each element after it, returning true as soon as two are equal and false if no pair matches.

#include <stdbool.h>

bool containsDuplicate(int* nums, int numsSize){
    /* Compare each element with every element after it. */
    for(int i = 0; i < numsSize - 1; i++)
    {
        for(int j = i + 1; j < numsSize; j++)
        {
            if(nums[i] == nums[j])
            {
                return true;
            }
        }
    }
    return false;
}

To minimize run time, I return as soon as a duplicate is found, but my code still performs poorly on large arrays. I'm looking for an algorithm with O(n) time complexity, if possible. Is there any way to skip values that are duplicates of values already examined? I've looked at other solutions, but I couldn't find a better one in C.

CodePudding user response:

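Your algorithm is O(n^2). If you sort the array first, which a comparison sort can do in O(n log n), any duplicates end up next to each other, so determining whether there is a duplicate takes a single O(n) pass over the sorted array. The total is O(n log n).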
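A minimal sketch of the sort-then-scan idea, assuming it is acceptable to reorder the caller's array in place. The names contains_duplicate_sorted and cmp_int are hypothetical, chosen for illustration. Note that the C standard does not promise a particular complexity for qsort, although common implementations run in O(n log n) on typical input.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Comparator for qsort. Comparing instead of subtracting avoids
   the signed overflow that "x - y" could cause for extreme values. */
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical name. Sorts nums in place (the caller's order is lost),
   then checks neighbouring elements for equality. */
bool contains_duplicate_sorted(int *nums, int numsSize) {
    if (numsSize < 2) {
        return false;
    }
    qsort(nums, (size_t)numsSize, sizeof nums[0], cmp_int);
    for (int i = 1; i < numsSize; i++) {
        if (nums[i] == nums[i - 1]) {
            return true;   /* equal values are adjacent after sorting */
        }
    }
    return false;
}
```

If the original order matters, the same check can be run on a copy of the array at the cost of O(n) extra space.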

You could instead maintain a lookup table that records whether each value has been seen before, which runs in O(n) time, but unless the potential range of values stored in the array is relatively small, the memory usage is prohibitive.

For instance, if you know the values in the array will range from 0 to 127:

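#include <stddef.h>

/* Returns 1 if arr contains a duplicate, 0 otherwise.
   Assumes every value is in the range 0-127. */
int contains_dupes(int *arr, size_t n) {
   char seen[128] = {0};
   for (size_t i = 0; i < n; i++) {
       if (seen[arr[i]]) return 1;   /* value seen before: duplicate */
       seen[arr[i]] = 1;
   }
   return 0;
}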

But if we assume int is 4 bytes, the values in the array can be any int, and we use char for the lookup table, then the table needs 2^32 one-byte entries, which is 4GB.

CodePudding user response:

O(n) time, O(n) space: use a set or map (with a hash-based one, the O(n) is expected time rather than worst case). Walk through your array, checking each element in turn for membership in your set or map. If it's present then you've found a duplicate; if not, then add it.
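C has no set or map in its standard library, so here is a minimal sketch of the idea with a hand-rolled hash set (open addressing with linear probing). The names contains_duplicate_hashed and hash_int, the mixing constant, and the -1 return value for allocation failure are all assumptions made for illustration, not part of the original question's interface.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: scrambles the key's bits so that clustered
   inputs spread across the table; mask must be capacity - 1. */
static size_t hash_int(int key, size_t mask) {
    uint32_t h = (uint32_t)key;
    h ^= h >> 16;
    h *= 0x45d9f3bU;
    h ^= h >> 16;
    return (size_t)h & mask;
}

/* Returns 1 if nums contains a duplicate, 0 if not,
   and -1 if the table could not be allocated. */
int contains_duplicate_hashed(const int *nums, int numsSize) {
    if (numsSize < 2) {
        return 0;
    }

    /* Power-of-two capacity of at least twice the element count,
       so the table is never more than half full. */
    size_t cap = 4;
    while (cap < (size_t)numsSize * 2) {
        cap <<= 1;
    }

    int *slots = malloc(cap * sizeof *slots);
    unsigned char *used = calloc(cap, 1);   /* 1 = slot occupied */
    if (slots == NULL || used == NULL) {
        free(slots);
        free(used);
        return -1;
    }

    int found = 0;
    for (int i = 0; i < numsSize && !found; i++) {
        size_t pos = hash_int(nums[i], cap - 1);
        /* Probe until the value or an empty slot is found. */
        while (used[pos]) {
            if (slots[pos] == nums[i]) {
                found = 1;   /* already in the set: duplicate */
                break;
            }
            pos = (pos + 1) & (cap - 1);
        }
        if (!found) {
            slots[pos] = nums[i];
            used[pos] = 1;
        }
    }

    free(slots);
    free(used);
    return found;
}
```

Keeping the table at most half full guarantees that every probe sequence reaches an empty slot, and keeps the expected cost per element constant.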

If O(n) space is too expensive, you can get away with far less by doing a first pass with a cuckoo filter, a space-efficient probabilistic structure built on cuckoo hashing (linked below) that guarantees no false negatives but can return false positives. Use the same approach as above, but with the filter instead of a set or map, and record every value it flags as a candidate. Any candidate may be a false positive, so it still needs to be checked.

Then walk through the array a second time, using the exact set or map approach described in the first paragraph, but skip past anything that isn't in your set of candidates.

This is still O(n) time.

https://en.wikipedia.org/wiki/Cuckoo_hashing
