Home > Software engineering >  Get all descendants of an entry
Get all descendants of an entry

Time:03-11

I have a huge referral system (Over 500k entries) that works like this

 [{id: 1, name: "John", ref: 0},
  {id: 2, name: "Jack", ref: 1},
  {id: 3, name: "Bill", ref: 1},
  {id: 5, name: "Jason", ref: 2},
  {id: 6, name: "James", ref: 3},
  {id: 7, name: "Tom", ref: 0}]

Whenever a user joins with other user's referral code, referrer gets some credits and It applies to all levels so in this example John gets credit for this IDs [2, 3, 5, 6]

I use this method to count and organize all entries based on their referral ID

   const countRefs = (list) => {
      return list.reduce((acc, cur) => {
        if(!acc[cur.ref]) acc[cur.ref] = [];
        acc[cur.ref].push(cur.id);
        return acc;
      },{});
    }

And then use this recursive function to get all referred users by an ID.

let depth = 0;
// Keep record of checked IDs
const checked = {};

const getTree = (list, ref) => {
// Check if referrer is already checked to avoid cycles
if(checked[ref]) return [];
  const ids = [];
  const items = list[ref] || [];
  checked[ref] = true;
  if (items.length && depth < 35000) {
    depth  = 1;
    for (let ref of items) {
      if(!list[ref]) continue;
      const deep = getTree(list, ref, depth);
      ids.push(...deep);
    }
  }


  checked = {};
  depth = 0;
  return [...ids, ...items];
}

Now I have two questions:

  1. Is there a better way of doing this? Like creating all relations within the first loop?
  2. With some entries I get Maximum Call Stack Error. Am I doing anything wrong here?

CodePudding user response:

Instead of depth-first you could implement a breadth-first algorithm. In JavaScript a Set will be a nice data structure to work with, as entries of a set are always iterated in insertion-order, and a for..of loop over a set will keep looping as long as new entries are added to the set being looped over, giving it the behaviour of a queue.

A set will also act as checked: if an entry is already in the set, adding it again will not have any effect on the set, and so the entry will not be visited a second time.

No change is needed to countRefs, but I would give it a different name, as it doesn't return a count, but a tree.

And the second function doesn't return a tree, but a list of descendants. So I would also rename that one:

// No change to this function
const makeTree = (list) => {
  return list.reduce((acc, cur) => {
    if (!acc[cur.ref]) acc[cur.ref] = [];
    acc[cur.ref].push(cur.id);
    return acc;
  }, {});
}

// Use breadth-first
const getDescendants = (list, ref) => {
  const children = new Set(list[ref] ?? []);
  for (const child of children) {
    for (const grandchild of list[child] ?? []) children.add(grandchild);
  }
  return [...children];
}

const list = [{id: 1, name: "John", ref: 0},
  {id: 2, name: "Jack", ref: 1},
  {id: 3, name: "Bill", ref: 1},
  {id: 5, name: "Jason", ref: 2},
  {id: 6, name: "James", ref: 3},
  {id: 7, name: "Tom", ref: 0}]

const tree = makeTree(list);
const descendants = getDescendants(tree, 1);
console.log(descendants);

CodePudding user response:

This seems like a good opportunity to use a data structure to write a scalable solution.

  • Build a tree data structure from the reference data

    • Create a node for each unique id
    • For each entry in the data, add a child node id to the node ref

    Note that by definition tree should not have cycles, i.e. an entry should not have its own id as ref

  • Once the tree is in place, the problem boils down to finding the total number of nodes in each of the subtrees of the tree, which is a well studied problem. The time complexity required to solve this is O(n), where n is the total number of nodes.

    In this case, the credit for a particular id will be:

    Number of nodes in the subtree where that id is the root node - 1 (excluding itself).

    Implement the DFS iteratively instead of using recursive calls to avoid stack overflow (no pun intended)

  • Related