How would you remove a number from an unsorted vector using only .at(), .push

I'm new to C and I'm trying to write a void function that will delete a duplicate from a vector while preserving the order of the vector. I'm having trouble deleting the number from my vector using just .at(), .push_back(), .size(), and .resize(). How would I go about doing this?

This is what I have so far:

void RemoveDuplicates(std::vector<int>& vector, int vectorSize) 
{
    int i;
    int j;
    std::vector<int> tempVec; 

    for (i = 0; i < vector.size(); i  )
    {
        for (j = 1; j < vector.size(); j  )
        {
            if (vector.at(i) == vector.at(j))
            {
                tempVec.push_back(vector.at(i));  //Unduplicated Vector
            }
        }
    }
}

If I were to put "1 2 3 3" in this it returns the tempVec as "2 3 3 3 3." The expected result was just "1 2 3." How would I go about fixing this so that it deduplicates the vector using just those vector manipulators?

CodePudding user response：

Here's a simple approach, not very efficient but easy to understand

void RemoveDuplicates(std::vector<int>& vector) 
{
    std::vector<int> tempVec; 
    for (size_t i = 0; i < vector.size(); i  )
    {
        // look for vector[i] in remainder of vector
        bool found = false;
        for (size_t j = i   1; j < vector.size(); j  )
        {
            if (vector.at(i) == vector.at(j))
            {
                found = true;
                break;
            }
        }
        // if not found it's not a duplicate
        if (!found)
            tempVec.push_back(vector.at(i));
    }
    vector = tempVec;
}

CodePudding user response：

Building on your current idea, you can compare each value in vector with all the values in tempVec. If it's not found in tempVec, add it.

I'm using range based for-loops to simplify the looping:

#include <utility> // std::move

void RemoveDuplicates(std::vector<int>& vector) {
    std::vector<int> tempVec; 
    
    for(int in : vector) {       // `in` will be assigned one value at a time from vector
        bool found = false;      // set to `true` if the `in` value has already been seen
        for(int out : tempVec) { // range based for-loop again
            if(in == out) {      // oups, duplicate
                found = true;    // set to true to avoid storing it
                break;           // and abort this inner loop
            }
        }
        // only stored values not found:
        if(not found) tempVec.push_back(in);
    }
    // move assign the result to `vector`:
    vector = std::move(tempVec);
}

CodePudding user response：

For starters the second parameter of the function is redundant and not used within the function. Remove it. The function should be declared like

void RemoveDuplicates( std::vector<int> &vector );

Also you forgot to change the original vector.

And it seems you mean inequality in the if condition

if (vector.at(i) != vector.at(j))
{
    tempVec.push_back(vector.at(i));  //Unduplicated Vector
}

instead of the equality

if (vector.at(i) == vector.at(j))
{
    tempVec.push_back(vector.at(i));  //Unduplicated Vector
}

Though in any case it is an incorrect logic within the inner for loop.

Your inner for loop is always start from 1

for (j = 1; j < vector.size(); j  )

So for this source vector { 1, 2, 3, 3 } the value 2 will be pushed once on the tempVec and for each element with the value 3 there will be pushed two values equal to 3 to tempVec.

Using your approach the function can look the following way.

void RemoveDuplicates( std::vector<int> &vector ) 
{
    std::vector<int> tempVec; 

    for ( std::vector<int>::size_type i = 0; i < vector.size(); i   )
    {
        std::vector<int>::size_type j = 0;

        while ( j < tempVec.size() && vector.at( i ) != tempVec.at( j ) )
        {
              j;
        }

        if ( j == tempVec.size() )
        {
            tempVec.push_back( vector.at( i ) );
        }
    }

    std::swap( vector, tempVec );
}

Here is a demonstration program.

#include <iostream>
#include <vector>

void RemoveDuplicates( std::vector<int> &vector ) 
{
    std::vector<int> tempVec; 

    for ( std::vector<int>::size_type i = 0; i < vector.size(); i   )
    {
        std::vector<int>::size_type j = 0;

        while ( j < tempVec.size() && vector.at( i ) != tempVec.at( j ) )
        {
              j;
        }

        if ( j == tempVec.size() )
        {
            tempVec.push_back( vector.at( i ) );
        }
    }

    std::swap( vector, tempVec );
}

int main() 
{
    std::vector<int> v = { 1, 2, 3, 3 };

    std::cout << v.size() << ": ";
    for ( const auto &item : v )
    {
        std::cout << item << ' ';
    }
    std::cout << '\n';

    RemoveDuplicates( v );

    std::cout << v.size() << ": ";
    for ( const auto &item : v )
    {
        std::cout << item << ' ';
    }
    std::cout << '\n';
}

The program output is

4: 1 2 3 3 
3: 1 2 3

CodePudding user response：

Resizing an array (including vectors) in-place and shifting items over by 1 each time a duplicate is found can be expensive.

If you can use a hash table based collection (i.e. unordered_set or unordered_map) to keep track of items already seen, you can have an O(N) based algorithm.

Aside from the std::unqiue solution already suggested, it's hard to beat this. std::unique is effectively the same thing.

void RemoveDuplicates(std::vector<int>& vec)
{
    std::unordered_set<int> dupes;
    std::vector<int> vecNew;
    for (int x : vec)
    {
        if (dupes.insert(x).second)
        {
            vecNew.push_back(x);
        }
    }
    vec = std::move(vecNew);
}