Get all descendants of an entry-CodePudding

I have a huge referral system (Over 500k entries) that works like this

 [{id: 1, name: "John", ref: 0},
  {id: 2, name: "Jack", ref: 1},
  {id: 3, name: "Bill", ref: 1},
  {id: 5, name: "Jason", ref: 2},
  {id: 6, name: "James", ref: 3},
  {id: 7, name: "Tom", ref: 0}]

Whenever a user joins with other user's referral code, referrer gets some credits and It applies to all levels so in this example John gets credit for this IDs [2, 3, 5, 6]

I use this method to count and organize all entries based on their referral ID

   const countRefs = (list) => {
      return list.reduce((acc, cur) => {
        if(!acc[cur.ref]) acc[cur.ref] = [];
        acc[cur.ref].push(cur.id);
        return acc;
      },{});
    }

And then use this recursive function to get all referred users by an ID.

let depth = 0;
// Keep record of checked IDs
const checked = {};

const getTree = (list, ref) => {
// Check if referrer is already checked to avoid cycles
if(checked[ref]) return [];
  const ids = [];
  const items = list[ref] || [];
  checked[ref] = true;
  if (items.length && depth < 35000) {
    depth  = 1;
    for (let ref of items) {
      if(!list[ref]) continue;
      const deep = getTree(list, ref, depth);
      ids.push(...deep);
    }
  }


  checked = {};
  depth = 0;
  return [...ids, ...items];
}

Now I have two questions:

Is there a better way of doing this? Like creating all relations within the first loop?
With some entries I get Maximum Call Stack Error. Am I doing anything wrong here?

CodePudding user response：

Instead of depth-first you could implement a breadth-first algorithm. In JavaScript a Set will be a nice data structure to work with, as entries of a set are always iterated in insertion-order, and a for..of loop over a set will keep looping as long as new entries are added to the set being looped over, giving it the behaviour of a queue.

A set will also act as checked: if an entry is already in the set, adding it again will not have any effect on the set, and so the entry will not be visited a second time.

No change is needed to countRefs, but I would give it a different name, as it doesn't return a count, but a tree.

And the second function doesn't return a tree, but a list of descendants. So I would also rename that one:

// No change to this function
const makeTree = (list) => {
  return list.reduce((acc, cur) => {
    if (!acc[cur.ref]) acc[cur.ref] = [];
    acc[cur.ref].push(cur.id);
    return acc;
  }, {});
}

// Use breadth-first
const getDescendants = (list, ref) => {
  const children = new Set(list[ref] ?? []);
  for (const child of children) {
    for (const grandchild of list[child] ?? []) children.add(grandchild);
  }
  return [...children];
}

const list = [{id: 1, name: "John", ref: 0},
  {id: 2, name: "Jack", ref: 1},
  {id: 3, name: "Bill", ref: 1},
  {id: 5, name: "Jason", ref: 2},
  {id: 6, name: "James", ref: 3},
  {id: 7, name: "Tom", ref: 0}]

const tree = makeTree(list);
const descendants = getDescendants(tree, 1);
console.log(descendants);

CodePudding user response：

This seems like a good opportunity to use a data structure to write a scalable solution.

Build a tree data structure from the reference data
- Create a node for each unique id
- For each entry in the data, add a child node id to the node ref
Note that by definition tree should not have cycles, i.e. an entry should not have its own id as ref
Once the tree is in place, the problem boils down to finding the total number of nodes in each of the subtrees of the tree, which is a well studied problem. The time complexity required to solve this is O(n), where n is the total number of nodes.

In this case, the credit for a particular id will be:

Number of nodes in the subtree where that id is the root node - 1 (excluding itself).

Implement the DFS iteratively instead of using recursive calls to avoid stack overflow (no pun intended)