Computing array of relationships from tags

Time:03-10

I'm working on a d3-force visualisation, which requires data in a specific shape. I've got an array of objects, each with an array of tags.

nodes = [
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["english", "workshop"] },
  ...
]

To establish connections between data, I need to explicitly define an array of links:

links = [
  { 
    source: 'post1', 
    target: 'post2' 
  },
  { 
    source: 'post2', 
    target: 'post3' 
  },
  ...
]

Links are not weighted by similarity: all relationships are linear and carry the same "agency". Duplicate links should ideally be filtered out so that no line is drawn twice.

How can I generate a link array of the previously mentioned shape from the tag_list arrays?

Here's an example of the required data structure.

--

Some context: I'm trying to visualise thematic overlaps between blog pages. Every page has an array of tags describing it (tag_list). I want to connect pages that share tags within the graph. Since d3 requires explicit references to draw links (see link below), I need to compute these from the tag lists that are available to me.

CodePudding user response:

You could collect each tag, and for each tag collect the distinct names that use it (in a Set). When a tag already has names associated with it, iterate over those and pair each one with the "current" name, putting the lexically smaller name first in the pair. Record each pair in a map of Sets, so that every pair is emitted only once.

Here is an implementation:

let nodes = [
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["english", "workshop"] },
];

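let tags = {};   // tag -> Set of post names already seen with that tag
let pairs = {};  // source name -> Set of target names already linked to it
let result = [];
for (let {name, tag_list} of nodes) {
    for (let tag of tag_list) {
        // Pair the current post with every earlier post that shares this tag
        for (let other of tags[tag] ??= new Set()) {
            // Sort so that each pair is stored in one canonical order
            let [source, target] = [name, other].sort();
            // Only emit a link the first time this pair is seen
            if (!(pairs[source] ??= new Set()).has(target)) {
                pairs[source].add(target);
                result.push({source, target});
            }
        }
        tags[tag].add(name);
    }
}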

console.log(result);
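
The same idea can be wrapped in a reusable function. This is only a sketch: the name buildLinks is hypothetical, it uses a Map and a Set instead of plain objects, and it adds one assumption that is not in the answer above, namely that a tag repeated inside a single tag_list should not produce a link from a post to itself.

```javascript
// Hypothetical helper; the name and signature are illustrative only.
function buildLinks(nodes) {
  const namesByTag = new Map(); // tag -> Set of post names seen so far
  const seenPairs = new Set();  // serialized [source, target] pairs already emitted
  const links = [];

  for (const { name, tag_list } of nodes) {
    for (const tag of tag_list) {
      if (!namesByTag.has(tag)) namesByTag.set(tag, new Set());
      const names = namesByTag.get(tag);

      for (const other of names) {
        // Assumption: skip self-links caused by a tag repeated within one post
        if (other === name) continue;
        // Canonical order, so (A, B) and (B, A) count as the same pair
        const [source, target] = [name, other].sort();
        const key = JSON.stringify([source, target]);
        if (!seenPairs.has(key)) {
          seenPairs.add(key);
          links.push({ source, target });
        }
      }
      names.add(name);
    }
  }
  return links;
}

const sampleLinks = buildLinks([
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["english", "workshop"] },
]);
console.log(sampleLinks);
// [ { source: 'post1', target: 'post2' }, { source: 'post2', target: 'post3' } ]
```

Two posts that share several tags still produce a single link, because the pair is recorded the first time it is seen.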

CodePudding user response:

You can use a hash grouping approach. First build an object whose keys are hashes of the links (one key per unordered pair of names), then take only its values as the result.

const nodes = [
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["online"] },
  { name: "post4", tag_list: ["workshop"] },
  { name: "post5", tag_list: ["lonely"] },
];

// True when the two arrays share at least one element
const hasIntersection = (arrA, arrB) => arrA.some((el) => arrB.includes(el));

const groupedByHash = nodes.reduce((acc, targetNode) => {
  // All other nodes that share at least one tag with targetNode
  const commonNodes = nodes
    .filter(({ tag_list }) => hasIntersection(tag_list, targetNode.tag_list))
    .filter(({ name }) => name !== targetNode.name);

  if (commonNodes.length < 1) return acc;

  // Key each link by its sorted pair of names, so A-B and B-A share one entry
  const commonLinks = commonNodes.reduce((acc, { name }) => {
    const [source, target] = [name, targetNode.name].sort();
    const hash = [source, target].join('---');
    acc[hash] = { source, target };
    return acc;
  }, {});

  return { ...acc, ...commonLinks };
}, {});

const result = Object.values(groupedByHash);


console.log(result);
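
The hash only needs to be a string that comes out identical for both orderings of a pair. A minimal sketch of that step in isolation, assuming a hypothetical helper linkKey that is not part of the answer: serializing the sorted pair with JSON.stringify, rather than joining with '---', keeps the key unambiguous even if a name happens to contain the separator.

```javascript
// Hypothetical helper: an order-independent key for a pair of names.
const linkKey = (a, b) => JSON.stringify([a, b].sort());

// Deduplicate candidate pairs; a Map keeps the first insertion position of each key.
const uniqueByKey = new Map();
for (const [a, b] of [["post2", "post1"], ["post1", "post2"], ["post2", "post3"]]) {
  const [source, target] = [a, b].sort();
  uniqueByKey.set(linkKey(a, b), { source, target });
}
const dedupedLinks = [...uniqueByKey.values()];
console.log(dedupedLinks);
// [ { source: 'post1', target: 'post2' }, { source: 'post2', target: 'post3' } ]

// A plain separator can collide when names contain it; the serialized key does not.
console.log(["a---b", "c"].join("---") === ["a", "b---c"].join("---")); // true
console.log(linkKey("a---b", "c") === linkKey("a", "b---c"));           // false
```

For names like post1 and post2 the '---' separator is fine; the serialized key only matters if names are free-form.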
