Home > database >  Why does strcpy() force this for()-loop to iterate once more than without strcpy() and cause a segme
Why does strcpy() force this for()-loop to iterate once more than without strcpy() and cause a segme

Time:10-19

I'm trying to implement my own Linux shell using C . The following code is working fine:

#include <string.h>
#include <sstream>
#include <iostream>
#include <vector>
#include <unistd.h>
#include <wait.h>
#include <stdio.h>

using namespace std;

int main()
{
    // read the input command/arguments
    cout << "myshell >> ";
    string input;
    getline(cin, input);

    // if command is 'exit', exit myshell
    if (input == "exit")
    {
        cout << exit;
        exit(0);
    }

    // init stringstream
    stringstream in_stream(input);

    // create vector to store the parts of the command
    vector<string> parts;

    // string to store the parts during one iteration
    string part;

    // read all parts of the stringstream, separated by ' ' (space)
    while (getline(in_stream, part, ' '))
    {
        parts.push_back(part);
    }

    // create c-like array for passing it to the execvp() system call
    // execvp() needs nullptr as last element, thus: size()   1
    const char* parts_arr[parts.size() 1];

    // store all elements of the parts vector in parts_arr as c-like strings
    for (int i = 0; i < parts.size(); i  )
    {
        parts_arr[i] = parts[i].c_str();
    }

    // last element must be nullptr
    parts_arr[parts.size()] = nullptr;

    // init some variables
    int pid;
    int status;

    // fork and check the returned pid
    switch (pid = fork())
    {
    case -1:
        cerr << "fork() encountered an error." << endl;
        return 1;

    case 0: // child process
        if (execvp(parts_arr[0], const_cast<char *const *>(parts_arr)) == -1)
        {
            cerr << "Child: execvp() encountered an error." << endl;
            return 1;
        }

    default: // parent process
        if (waitpid(pid, &status,0) != pid)
        {
            cerr << "Parent: wait() encountered an error." << endl;
            return 1;
        }
        cerr << "Parent: child " << pid << " ended with status " << WEXITSTATUS(status) << endl; // least significant bits transport additional information
        return 0;
    }
}

Output:

$ ./myshell
myshell >> ls -l
total 92
-rwxrwxr-x 1 ubuntu ubuntu 41576 Oct 17 13:24 a.out
-rwxrwxr-x 1 ubuntu ubuntu 41824 Oct 18 22:19 myshell
-rw-rw-r-- 1 ubuntu ubuntu  2140 Oct 18 22:19 myshell.cpp
Parent: child 10538 ended with status 0

As one might think (me included): the const_cast in the execvp()-call just does not look right, so I was trying to clean-up and rewrite my code to eliminate the need for this cast.

Doing this, I encountered a problem modifying the creation of the c-like string array for the execvp() system call leading to a Segmentation fault (core dumped) error. The following code snippets are the only lines of code I changed:

    (...)

    // create c-like array for passing it to the execvp() system call
    // execvp() needs nullptr as last element, thus: size()   1
    char* parts_arr[parts.size() 1]; // <-- changed

    // store all elements of the parts vector in parts_arr as c-like strings
    for (int i = 0; i < parts.size(); i  )
    {
        strcpy(parts_arr[i], parts[i].c_str()); // <-- changed
    }

    (...)

    case 0: // child process
        if (execvp(parts_arr[0], parts_arr) == -1) // <-- changed
        {
            cerr << "Child: execvp() encountered an error." << endl;
            return 1;
        }

    (...)

Output:

$ ./myshell
myshell >> ls -l
Segmentation fault (core dumped)

While debugging, I found that with the original code the for-loop iterates correctly, but using the new code it keeps iterating and causes the segmentation fault due to the faulty indexing.

Why does this happen? From my understanding, the for-loop should work equally for both codes, as the changes do not affect anything related to the iteration of the loop.

Also, I am open for any suggestion on how to improve the whole command/argument parsing, maybe completely eliminating the need for the conversion from the string-vector to the c-like string array.

CodePudding user response:

char* parts_arr[parts.size() 1]; // <-- changed

The size of an array variable must be compile time constant. parts.size() 1 is not compile time constant. The program is ill-formed. That issue was already in the original version.

However, let's assume that you use non-standard language extensions that allow this. The array contains pointers. You've default initialised the array, and thus the pointers have indeterminate values.

strcpy(parts_arr[i], parts[i].c_str());

strcpy requires that the fist argument is a valid pointer to sufficiently large amount of memory to fit the entire string. The pointers that you pass into strcpy are not valid pointers to sufficiently large amount of memory to fit the entire string. Not only does that violate the pre-conditions of the function, but also reading any indeterminate value of pointer type is never allowed. The behaviour of the program is undefined.

I was trying to clean-up and rewrite my code to eliminate the need for this cast.

Introducing strcpy is counter productive for "clean-up". It's a relic from C that you should avoid using it in C .

There's a much easier way to avoid the cast. Changing the type of the array element to char* is a good start for minimal change. Only other change you need is to use std::string::data instead of std::string::c_str:

parts_arr[i] = parts[i].data();
  •  Tags:  
  • c
  • Related