counting number of elements less than X in a BST-CodePudding

I had implemented a BST for a multiset using the C code below, whereas each node contains the number of occurrence num of each distinct number data, and I try to find the number of elements less than certain value x, using the order function below.

It works, however, inefficient in terms of execution time. Is there any method with better time complexity?

struct Node {
    int data;
    int height;
    int num;

    Node *left;
    Node *right;
};

int order(Node *n, int x) {
    int sum = 0;
    if (n != NULL) {
        if (n->data < x) {
            sum  = n->num;
            sum  = order(n->right, x);
        }
        sum  = order(n->left, x);
    }
    return sum;
}

CodePudding user response：

You can bring the algorithm down to O(logN) time by storing in each node the number of elements in the subtree of which it is the root. Then you'd only have to recurse on one of the two children of each node (go left if x < node->data, right if x > node->data), which if the tree is balanced only takes logarithmic time.

struct Node {
    int data;
    int height;
    int num;
    int size; // numer of elements in the subtree starting at this node

    Node *left;
    Node *right;
};

int order(Node *n, int x) {
    if(n == NULL) return 0;

    // elements less than n->data make up the whole left subtree
    if (x == n->data) {
        return n->left ? n->left->size : 0;
    }
    // even smaller? just recurse left
    else if (x < n->data) {
        return order(n->left, x);
    }
    // bigger? take the whole left subtree and part of the right one
    else {
        return (n->left ? n->left->size : 0)   order(n->right, x);
    }
}

Of course, now you have to keep track of the size, but this can be done very efficiently when updating the tree: simply recalculate the size (n->left->size n->right->size 1) of each modified node in an insertion or deletion.

CodePudding user response：

Unless you can change the BST structure or create a meta-structure around it, your best bet is to remove the recursion by flattening the algorithm, e.g.:

int order(Node *n, int x) {
    if (n == NULL) return 0;

    int sum = 0;

    if (n->data >= x) {

        // Search left until we find the first node < x
        while (n != NULL && n->data >= x) {
            n = n->left;
        }

        // Add numbers of all left nodes up to the leftmost one
        while (n != NULL) {
            sum  = n->num;
            n = n->left;
        }

    }
    else {
        // n->data < x

        // Add numbers of all left nodes up to the leftmost one
        Node *m = n;
        while (m != NULL) {
            sum  = m->num;
            m = m->right;
        }

        // Add numbers of all right nodes as long as they are < x
        n = n->right;
        while (n != NULL && n->data < x) {
            sum  = n->num;
            n = n->right;
        }

    }

    return sum;
}

The algorithm is much simpler if you have access to the leftmost node instead of passing a random node in the middle of the tree (random as in "random access memory", not as in "random number"):

int order(Node *leftmost, int x) {
    int sum = 0;
    for (Node *n = leftmost; n != NULL && n->data < x; n = n->right)
        sum  = n->num;
    return sum;
}