How to sort vector using insertion sort?-CodePudding

I was just practicing implemention of selection sort using vectors. But it returns the same vector I gave as input.

This implementation is working fine with an array but not vectors.

#include<bits/stdc  .h>
using namespace std;
vector<int>  insertionSort(vector<int> v, int n)
{
    for(int i=1;i<=n-1;i  )
    {
        int currentEle = v[i];
        int prevEle = i-1;
        // right index where current is inserted
        while(prevEle>=0 and v[prevEle]>currentEle){
            v[prevEle  1] = v[prevEle];
            prevEle = prevEle-1;

        }
       v[prevEle 1] = currentEle;
    }
      return v;
    
}
int main()
{
    int n;
    cin>>n;
    vector<int> v(n);
    for(int i=0;i<n;i  )
    {
        cin>>v[i];
    }
    insertionSort(v,n);
    for(auto x: v)
    {
        cout<<x<<" ";
        
    }

  //OUTPUTS
    
    Input : 5 3 1 2 4
    Expected Output : 1 2 3 4 5
    My Output : 5 3 1 2 4

Is there any specific way to sort vectors using selection sort using the same algorithm?

CodePudding user response：

You are not storing the returned vector from insertionSort into anything.

You can do:

std::vector<int> result = insertionSort(v,n);

instead of:

insertionSort(v,n);

and then print result instead of v

for (auto &x : result) {
  std::cout << x << " ";
}

Some other points not related to the question but helpful.

CodePudding user response：

The solution to my problem was using vector as value or reference

vector<int>  insertionSort(vector<int> &v, int n)

In a pass by reference, the actual parameter passes to the function.In case for pass by value parameter value copies to another variable. So best way to do this using vector as reference

CodePudding user response：

You are passing the vector by value to the insertionSort function but in the main function, you are not collecting the return value. You are printing the same input vector v in the end and hence, you are getting the same output.

So your main function should be like this.

int main()
{
    int n;
    cin>>n;
    vector<int> v(n);
    for(int i=0;i<n;i  )
    {
        cin>>v[i];
    }
    auto sortedVector = insertionSort(v,n);
    for(auto x: sortedVector)
    {
        cout<<x<<" ";
        
    }
    return 0;
}

CodePudding user response：

Let's step through your code like a debugger would. Here we'll put a breakpoint at the line where insertionSort(v,n); is about to be called. We'll assume that the user has already entered valid unsorted input integers. Arbitrarily I'll use the vector {3,2,9,5,7} which in your case doesn't matter.

We'll step into the code as your debugger will, now this will vary on your particular compiler and debugger tools, but for simplicity sake we can generalize what is happening here based on the rules of the C language.

The issue here is not happening at runtime. The issue here is happening at compile time. It is still valid code as there are no compile errors generated so you are thinking that it must be a runtime error because you are not getting the correct outputs.

Here you have to understand how the C compiler interprets function declaration/definition and function composition. When compiled it looks for the function that you declared as insertionSort() that must be defined before first use. In your case it is. You declared and defined it on line 3 of your code. Next it must determine if its return type and parameter types match.

Your call site at line 30: insertionSort(v,n);

Matching declaration/definition is found at line 3: vector<int> insertionSort(vector<int> v, int n); It returns a vector<int> but return types can be ignored in C at the callsite of the function. Next is the parameter types as it takes in a vector<int>and an int type by value.

When your source code is compiled into object code and then linked and translated into assembly. What's happening here is that this function will have its own Stack Frame where these variables are only visible and have a lifetime within the Scope of this function provided you are not using pointers or dynamic memory (the heap).

These variables will be destroyed when this function returns back to the call site. Also, passing by value will normally cause a copy of the value that is passed in.

Here is a kind of what the stack frame might look like for this function: Here I'll use pseudo assembly mnemonics in a worded algorithmic diagram. Now, this is a simplification of modern hardware due to compiler optimizations and modern hardware with multiple caches as well as vector intrinsics which can be ignored as they are beyond the scope of what is happening here.

rsp -> register stack pointer

ret -> return from procedure

rpt -> stack frame's procedure return pointer

r0 -> zero register

[r1 - rn] -> arbitrary multipurpose registers

mov -> opcode or instruction, either moves, stores or loads a value into some register by either an immediate value or by some other register or memory location.

 //Stack Frame:
 {   
     // Setup the stack and frame pointers... internal housekeeping

     // Return Type Setup: vector<int>
     rpt assigned to the first value in `vector<int>`
     // this is reserved memory specifically for return values 

     // Parameter Type Setup: vector<int>, int

     // n's initialized value stored in reg1
     mov r1, n's value

     // values to be stored {3,2,9,5,7}
     mov r2, 3
     mov r3, 2
     mov r4, 9
     mov r5, 5
     mov r6, 7

     /* Function Implementation. Not going to convert this to assembly
        However, it will use the rpt pointer to construct the new
        vector that is local to this function after the necessary 
        checks are performed... 

        for(int i=1;i<=n-1;i  )
        {
            int currentEle = v[i];
            int prevEle = i-1;
            // right index where current is inserted
            while(prevEle>=0 and v[prevEle]>currentEle){
                v[prevEle  1] = v[prevEle];
                prevEle = prevEle-1;
            }
            v[prevEle 1] = currentEle;
        }
      */

     // Make sure that shared registers are reassigned or their original values
     // are restored before returning from procedure.
     // More internal housekeeping...

     // Assuming the algorithm is correct and the local registers were assigned
     // to the designated memory pointed to by rpt
     // Then prt would look something like:
     // {rpt[0] = r3, rpt[1] = r2, rpt[2] = r5, rpt[3] = r6, rpt[4] = r1}
     ret rpt
 }

This is a pseudo example of what your stack frame would look like in assembly sort of. Now, ret is a like a pointer in C/C but in assembly it's a special register that holds a memory address to another register or register segments within the CPU's register files or may even point to cache in modern systems...

Now, within your C code we have reached the end of this function, and it goes out of scope. All local variables are destroyed as that memory is given back to the system and or operating system...

Within the calling of this function you never assign it to any variable from its return value. You passed v into this function by value, and copies are made to the stack. You created a temporary vector internal to this function and performed the sorting algorithm. You return from this function and never use its return value.

In your next line section of C code You are using a ranged based for loop and traversing through main's variable v which was assigned the values {3,2,9,5,7} from the user input.

You passed this into your function by value which is copied information into temporaries. And main's v variable which lives within its own stack frame is never modified. The temporary return variable within insertionSort()' stack frame is the one that was changed.

This isn't a compile time nor a runtime error. This is just a bad implementation design of your algorithm by not fully understanding the specifications of the language and how the compiler treats it.

To fix this function you can drop the return type and pass by reference. The signature will look like this:

void insertionSort(vector<int>& v, int n);

Here a reference to the variable is passed into the function. This time around when your compile sees this and generates the object code later to be translated into assembly. It doesn't make a copy of these values, it uses the same registers or cache lines directly.

In this case, the object being passed into this function can be modified after any computations have been performed on it.

Here's a simple example of a basic integer add function replicating exactly what you have done:

// Pass By Value:
int add(int a, int b) {
    a = a b;
    return a;
}

int main() {
    int a = 3;
    int b = 5;
    add(a, b);
    std::cout << a;
    return 0;
}

Here you're expecting to see 8 printed, but in fact a from main is never modified and 3 will be printed to your console and you never used the return value from the function.

Let's check the pass by reference version:

void add(int& a, int b) {
    a = a b;
    return a;
}

int main() {
    int a = 3;
    int b = 5;
    add(a, b);
    std::cout << a;
    return 0;
}

In this version because a reference to main's a is used directly within the stack frame of add() main's a will be modified. This time when you print to the console you will see an 8.

I hope this helps to clear up the different ways of passing parameter types into functions within C . You can pass by value, by reference and by pointer and some of them have const versions which is a out of the scope of this topic. Also, do not rely on your parameter or function argument ordering when implementing your algorithms. There is no specification or single use convention or guarantee in how a specific compiler will set up a function's arguments or parameter list when generating the necessary assembly. There are various calling conventions and you would have to know which compiler you are using and what calling convention is being used.

I hope this helps you to understand what is happening with your code, what's going on under the hood at different levels of abstraction and why you are getting the results that you are getting and to help you understand what type of error this is.

Everyone else seems to just say, "passing by value and not reference" assuming you know what they are and mean... I'm treating you as if this was your first day at looking at code or a new language.