Home > Mobile >  Implementation of BST C Segmentation Fault
Implementation of BST C Segmentation Fault

Time:02-16

I have implemented binary search tree in C and for some reason I am not seeing where the segmentation fault occurs. But I do notice that when I comment out root = node in the first conditional statement in addNode the error goes away. What exactly is a segmentation fault and how does it related to pointers?

#include <iostream>
#include <iomanip>
using namespace std;

class bstNode
{
public:
    int value;
    bstNode *left;
    bstNode *right;

    bstNode(){};
    ~bstNode(){};

    bstNode(int value)
    {
        this->value = value;
        this->left = NULL;
        this->right = NULL;
    }

    bstNode(int value, bstNode *left, bstNode *right)
    {
        this->value = value;
        this->left = left;
        this->right = right;
    }

    bstNode *root;

    void addNode(int value)
    {
        if (root == NULL)
        {
            bstNode *node = new bstNode(value);
            root = node;
        }
        else
        {
            bstNode *focusNode = root;
            bstNode *parent;

            while (focusNode != NULL)
            {
                if (value > focusNode->value)
                {
                    focusNode = focusNode->right;
                    if (focusNode == NULL)
                    {
                        focusNode->right = new bstNode(value);
                    }
                }
                else
                {
                    focusNode = focusNode->left;
                    if (focusNode == NULL)
                    {
                        focusNode->left = new bstNode(value);
                    }
                }
            }
        }
    }

    static void printBST(bstNode *node)
    {
        while (node != NULL)
        {
            printBST(node->left);
            cout << node->value;
            printBST(node->right);
        }
    }
};

int main()
{
    bstNode *node = new bstNode();
    node->addNode(7);
    node->addNode(2);
    node->addNode(18);
    node->addNode(6);
    node->addNode(4);
    node->addNode(23);

    bstNode::printBST(node->root);

    return 0;
}

CodePudding user response:

The immediate error is this

                if (focusNode == NULL) {

                    focusNode->left = new bstNode(value);
                }

this is clearly wrong, if a pointer is null you cannot use it. You have this in multiple places. Fix that and then update the question once you have got past that. How did I know this? I ran your code under my debugger and it told me immediatley, you should learn how to get the most out of your debugger.

Next

void addNode(int value)

as a method for a class defined as

class bstNode {

public:
    int value;

is very bad practice. In that method what does value refer to? The argument or the member variable. Get into the habit of giving member variables specific names like this

class bstNode {

public:
    int value_;

Also minor nits. The accepted style for naming classes is with leading Caps like this

class BstNode {

public:
    int value_;

or even

class BSTNode

class bstNode {

public:
    int value_;

CodePudding user response:

using namespace std;

I'd advise against doing this in general. It's hard to be sure what's in namespace std, but the short summary is "a lot, and more all the time", so making all of it visible directly can lead to problems.

bstNode(){};
~bstNode(){};

These don't really accomplish anything useful. The point of a constructor is to initialize the object, but these just leave the object uninitialized, which can lead to problems--especially segmentation faults when/if you try to dereference an uninitialized pointer.

bstNode(int value){

    this->value = value;
    this->left = NULL;
    this->right = NULL;
}

This is better, but I'd prefer to use a member initializer list instead of assignments inside the body of the ctor, and I'd prefer nullptr over NULL:

bstNode(int value) 
    : value(value)
    , left(nullptr)
    , right(nullptr) {}

This next one:

bstNode(int value, bstNode* left, bstNode* right){

    this->value = value;
    this->left = left;
    this->right = right;
}

...is pretty nicely written (though it could also use a member initializer list, which is usually preferable), but only rarely useful when building a binary search tree, because in normal use you only ever insert new leaf nodes, not new internal nodes.

void addNode(int value){

    if (root == NULL){

        bstNode* node = new bstNode(value);
        root = node;
    }
    else{

        bstNode* focusNode = root;
        bstNode* parent;

        while(focusNode != NULL){

            if(value > focusNode->value){

                focusNode = focusNode->right;
                if(focusNode == NULL){
                    
                    focusNode->right = new bstNode(value);
                }
            }
            else{

                focusNode = focusNode->left;
                if(focusNode == NULL){

                    focusNode->left = new bstNode(value);
                }
            }
        }    
    }
}

This is at least one obvious source of a segmentation fault--you dereference a pointer immediately after verifying that it's null.

At least for a first attempt, I think I'd use a recursive implementation, which tends to be simpler:

void addNode(int value, bstNode *&node = root) { 
    if (node == nullptr) {
        node = new node(value);
    } else if (value < node->value) {
        addNode(value, node->left);
    } else if (value > node->value) {
        addNode(value, node->right);
    } else {
        // attempt at inserting duplicate value
    }
}

Note that this passes a reference to a pointer, so we can modify the "current" pointer, rather than having to track the parent pointer while traversing the tree.

static void printBST(bstNode* node){

    while(node != NULL){
        printBST(node->left);
        cout << node->value;
        printBST(node->right);
    }    
}

Since we're doing this recursively, we don't need (or even want) a loop. Traversing the left sub-tree, the current node, and the right subtree traverses the entire tree, with no iteration needed.

Also note that this doesn't print any delimiter between the numbers in the nodes, so a tree containing 12, 34 and a tree containing 1, 2, 3, 4 will both be printed out as 1234, which probably isn't very useful. Fortunately, adding a delimiter is pretty easy.

static void printBST(bstNode* node){

    if (node != nullptr){
        printBST(node->left);
        cout << node->value << ' ';
        printBST(node->right);
    }
}

CodePudding user response:

A segmentation fault normally occurs when one invokes undefined behavior while attempting to access memory in some invalid operation. It is not guaranteed by the language standard that a segmentation fault will be the result of such ill-behavior. That your implementation does so should be considered a luxury. Believe me, it is much harder to find these issues when the code blissfully marches on as if nothing happened.

Anyway, some simple examples of potential undefined behavior that may cause such a gory ending include:

  • dereferencing a null pointer
  • dereferencing an indeterminate pointer (e.g. a "garbage" pointer)
  • breaching beyond the extents of allowed memory (ex: index out of range for an array)
  • attempting to write to read-only memory (ex: a string literal)
  • others not mentioned here.

They are all signs of invoking undefined behavior (not the only such signs). Frankly, the only nice thing about segmentation faults is they're generally fairly direct in their exhibition of evidence regarding improper behavior. This makes them attractive to run to under a debugger. Sometimes the evidence isn't so clear, but usually you can get a good idea of where things went south.

Your program has several possibilities for improper pointer dereferencing due to either uninitialized (indeterminate) pointers, or flat-out blatant disregard for null detection that you did do, and marching straight into dereferencing anyway. Examples include:

  • root is never initialized. Yet it is used in comparison logic (not good) and dereference operations in that state. While the comparison logic is not likely to cause a fault, the decisions made upon those comparisons are still rooted in undefined behavior, and that ain't good.
  • after checking a pointer is definitely NULL, dereferencing it anyway, such as if (focusNode == NULL){ focusNode->left = ... . This problem happens several times.

More to the point, however, a major problem with this code is the overall structure in the first place. I realize some academia source may have you convinced a subtree of a tree is still a tree. But that doesn't mean the container that represents a tree as a whole must also be a node in that tree. Use two classes: a Node class and a Tree class. In short, like this:

#include <iostream>
#include <iomanip>

class bstTree
{
private:
    // internal use only: node type for our tree.
    class bstNode
    {
    public:
        int value;
        bstNode *left;
        bstNode *right;

        // single explicit constructor 
        explicit bstNode(int val = 0, bstNode *l = nullptr, bstNode *r = nullptr)
            : value(val)
            , left(l)
            , right(r)
        {
        }
    };

    // post-order recursive cleanup
    static void freeTree(bstNode *p)
    {
        if (p != nullptr)
        {
            freeTree(p->left);
            freeTree(p->right);
            delete p;
        }
    }

    // inorder recursive print
    static void printTree(std::ostream& outp, bstNode const *p)
    {
        if (p != nullptr)
        {
            printTree(outp, p->left);
            outp << p->value << ' ';
            printTree(outp,p->right);
        }
    }

    // this tree's root pointer (default is null: e.g empty)
    bstNode *root = nullptr;

public:
    bstTree() = default;

    // we do not support these. delete them
    bstTree(const bstTree&) = delete;
    bstTree& operator =(const bstTree&) = delete;

    virtual ~bstTree()
    {
        freeTree(root);
    }

    // adds a new node to the tree
    void addNode(int value)
    {
        bstNode **pp = &root;
        while (*pp)
        {
            if ((*pp)->value < value)
                pp = &(*pp)->right;
            else
                pp = &(*pp)->left;
        }
        *pp= new bstNode(value);
    }

    void print(std::ostream& outp = std::cout)
    {
        if (root)
        {
            printTree(outp, root);
            outp << '\n';
        }
    }
};


// test program
int main()
{
    int data[] = { 7,2,18,6,4,23 };

    bstTree bst;
    for (auto x : data)
    {
        bst.addNode(x);
        bst.print();
    }
}

Output

7 
2 7 
2 7 18 
2 6 7 18 
2 4 6 7 18 
2 4 6 7 18 23 

This follows some basic premises:

  • Responsibility of a node vs a tree is better divided.
  • Nodes can chain to other nodes (left and right), but the tree itself has one starting point, one node: the root. And an "empty" tree has no root (e.g. a null root pointer).

That said, the only part of this code that will probably seem odd is the addNode function, which is radically simplified by using a pointer-to-pointer ideology.

void addNode(int value)
{
    bstNode **pp = &root;
    while (*pp != nullptr)
    {
        if ((*pp)->value < value)
            pp = &(*pp)->right;
        else
            pp = &(*pp)->left;
    }
    *pp= new bstNode(value);
}

The premise is to start at the root of the tree looking for a null pointer on which to hang the new node. We use a pointer-to-pointer to do this enumeration. It makes for a concise, and very efficient solution, with no special cases for empty trees (part of the algorithm already accounts for that implicitly).

  • Initially pp holds the address of the root member. (thus pointer to pointer is required).
  • If the tree is empty root will be null, and thus *pp will also be null. The loop condition (*pp != nullptr) will immediately break.
  • If the tree has nodes, however, the loop will continuously traverse them, further and further down, until it finds some child's left or right that is, in fact, null.
  • Regardless of how it happened (empty tree, or some discovered child left/right that is null), once the loop breaks pp holds the address of the pointer to receive the new node. If the tree was empty it will hold the address of root. Otherwise it will hold the address of some member left or right pointer somewhere in the tree. It doesn't matter which. The final line of code is what allocates the new node and places it at its new home.

CodePudding user response:

In the the following code...

while(focusNode != NULL){

    if(value > focusNode->value){

        focusNode = focusNode->right;
        if(focusNode == NULL){
            
            focusNode->right = new bstNode(value);
        }
    }
    else{

        focusNode = focusNode->left;
        if(focusNode == NULL){

            focusNode->left = new bstNode(value);
        }
    }

}

...you are referencing the children of a node that is guaranteed to be NULL because you verified that using the conditional statement. Since the node itself does not exist, it doesn't have properties like children. Imagine you're trying to communicate with the child of a person who has never existed.

The variable focusNode stores an address of a node. What focusNode->value does is that it goes to the node whose address focusNode stores and retrieves the value property from there.

When focusNode is NULL, it doesn't point to any node, thus you can't go there and retrieve its value property.

I wrote the code that you can replace with your while loop. I have tested it and it works:

while(true){
    if(value > focusNode->value){

        if(focusNode->right == NULL){
            focusNode->right = new bstNode(value);
            return;
        } else focusNode = focusNode->right;
    }
    else{

        if(focusNode->left == NULL){
            focusNode->left = new bstNode(value);
            return;
        } else focusNode = focusNode->left;
    }

}

I also fixed your printBST function. In the printBST function use if instead of while, because the the code inside the while loop would be executed an infinite number of times instead of printing the BST once.

static void printBST(bstNode* node){

    if(node != NULL){
    
        printBST(node->left);
        cout << node->value <<" ";
        printBST(node->right);
    }
}
  •  Tags:  
  • c
  • Related