I'm working on a d3-force visualisation, which requires data in a specific shape. I've got an array of objects, each with an array of tags.
nodes = [
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["english", "workshop"] },
  ...
]
To establish connections between data, I need to explicitly define an array of links:
links = [
  {
    source: 'post1',
    target: 'post2'
  },
  {
    source: 'post2',
    target: 'post3'
  },
  ...
]
All links are equal: there is no difference in strength between them, all relationships are linear and carry the same "agency". Duplicate links should ideally be filtered out so that the same line isn't drawn twice.
How can I generate a links array of the shape shown above from the tag_list arrays?
Here's an example of the required data structure.
--
Some context: I'm trying to visualise thematic overlaps between blog pages. All pages have an array of tags to describe them (tag_list). I wish to connect all tags within the graph. Since d3 requires verbose references to draw links (see link below), I need to compute these from the tag lists that are accessible to me.
CodePudding user response:
You could collect each tag, and for each tag collect the distinct names (in a Set). When a tag already has names associated with it, iterate over those and pair each one with the "current" name, putting the lexically smaller name first in the pair. Store each pair in a map of Sets, so that the pairs stay unique.
Here is an implementation:
let nodes = [
    { name: "post1", tag_list: ["activity", "online"] },
    { name: "post2", tag_list: ["workshop", "online"] },
    { name: "post3", tag_list: ["english", "workshop"] },
];
let tags = {};   // tag -> Set of names seen so far with that tag
let pairs = {};  // source -> Set of targets already linked to it
let result = [];
for (let {name, tag_list} of nodes) {
    for (let tag of tag_list) {
        // Pair the current name with every earlier name that has this tag
        for (let other of tags[tag] ??= new Set) {
            // Sort so that a pair is stored the same way in either order
            let [source, target] = [name, other].sort();
            if (!(pairs[source] ??= new Set).has(target)) {
                pairs[source].add(target);
                result.push({source, target});
            }
        }
        tags[tag].add(name);
    }
}
console.log(result);
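If you need this in more than one place, the same idea can be wrapped in a function. This is only a sketch: the name buildLinks is made up for illustration (it is not part of d3), and it assumes every node has a unique name and a tag_list array, as in the question. It also skips the case where one node lists the same tag twice, which would otherwise pair a post with itself.

```javascript
// Hypothetical helper (not a d3 API): builds unique links between
// nodes that share at least one tag.
function buildLinks(nodes) {
  const namesByTag = new Map();  // tag -> Set of node names seen so far
  const seenTargets = new Map(); // source -> Set of targets already linked
  const links = [];
  for (const { name, tag_list } of nodes) {
    for (const tag of tag_list) {
      if (!namesByTag.has(tag)) namesByTag.set(tag, new Set());
      const names = namesByTag.get(tag);
      for (const other of names) {
        if (other === name) continue; // same tag listed twice on one node
        // Sort so that (a, b) and (b, a) end up as the same pair
        const [source, target] = [name, other].sort();
        if (!seenTargets.has(source)) seenTargets.set(source, new Set());
        const targets = seenTargets.get(source);
        if (!targets.has(target)) {
          targets.add(target);
          links.push({ source, target });
        }
      }
      names.add(name);
    }
  }
  return links;
}

const sample = [
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["english", "workshop"] },
];
console.log(buildLinks(sample));
// [ { source: 'post1', target: 'post2' }, { source: 'post2', target: 'post3' } ]
```

For the three sample posts this gives the two links from the question.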
CodePudding user response:
You can use the hash grouping approach. First make an object where keys are the hashes of the links, and then use only the values as the result.
const nodes = [
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["online"] },
  { name: "post4", tag_list: ["workshop"] },
  { name: "post5", tag_list: ["lonely"] },
];

// True if the two arrays share at least one element
const hasIntersection = (arrA, arrB) => arrA.some((el) => arrB.includes(el));

const groupedByHash = nodes.reduce((acc, targetNode) => {
  // All other nodes that share at least one tag with targetNode
  const commonNodes = nodes
    .filter(({ tag_list }) => hasIntersection(tag_list, targetNode.tag_list))
    .filter(({ name }) => name !== targetNode.name);

  if (commonNodes.length < 1) return acc;

  const commonLinks = commonNodes.reduce((links, { name }) => {
    // Sorting makes the hash identical for (a, b) and (b, a)
    const [source, target] = [name, targetNode.name].sort();
    const hash = [source, target].join('---');
    links[hash] = { source, target };
    return links;
  }, {});

  return { ...acc, ...commonLinks };
}, {});

const result = Object.values(groupedByHash);
console.log(result);
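One caveat with joining the names into a string key: if a name can itself contain the separator ('---'), two different pairs can end up with the same key (for example "a---b" with "c", and "a" with "b---c"). A possible variant, given here only as a sketch, serialises the sorted pair with JSON.stringify and keeps the links in a Map. The names linkKey and linksFromSharedTags are made up for illustration, and the input is assumed to have the same shape as above.

```javascript
// Hypothetical helper: an unambiguous key for an unordered pair of names.
const linkKey = (a, b) => JSON.stringify([a, b].sort());

// Hypothetical helper: visits every pair of nodes once and keeps one link per key.
function linksFromSharedTags(nodes) {
  const byKey = new Map(); // key -> { source, target }
  for (let i = 0; i < nodes.length; i++) {
    const tagsI = new Set(nodes[i].tag_list);
    for (let j = i + 1; j < nodes.length; j++) {
      if (nodes[i].name === nodes[j].name) continue; // same post listed twice
      if (!nodes[j].tag_list.some((tag) => tagsI.has(tag))) continue;
      const [source, target] = [nodes[i].name, nodes[j].name].sort();
      // The Map drops repeats if the input contains the same post more than once
      byKey.set(linkKey(source, target), { source, target });
    }
  }
  return [...byKey.values()];
}

const posts = [
  { name: "post1", tag_list: ["activity", "online"] },
  { name: "post2", tag_list: ["workshop", "online"] },
  { name: "post3", tag_list: ["online"] },
  { name: "post4", tag_list: ["workshop"] },
  { name: "post5", tag_list: ["lonely"] },
];
console.log(linksFromSharedTags(posts));
// post1-post2, post1-post3, post2-post3, post2-post4 (post5 shares no tag)
```

Like the snippet above, this still compares every pair of posts, so it is fine for a blog-sized list but grows quadratically with the number of nodes.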