I have two arrays:
first arr - has a nested, multi-level structure.
second arr - a flat array of objects.
In my update function I iterate over the children of the first array and, if the condition is true, push objects into attributes.
In production arr2 has about 150,000 objects and arr1 is nested several levels deep.
How can I optimize my function so the loop runs faster on this much data? Right now one pass takes about 5 minutes.
var arr1 = [{
  "item_id": 2,
  "item_name": "test",
  "children": [{
    "item_id": 39646,
    "item_name": "test1",
    "children": [{
      "item_id": 35648,
      "item_name": "test2",
      "children": [{
        "item_id": 35771,
        "item_name": "test3",
        "children": [],
        "attributes": []
      }]
    }]
  }]
}]
var arr2 = [
  {
    "item_id": 35771,
    "attr_value": "test"
  }, {
    "item_id": 35771,
    "attr_value": "test1"
  }
]
const update = (array, id, object) => array.forEach(o =>
  o.item_id === id
    ? o.attributes.push(object)
    : update(o.children, id, object)
);

for (let item of arr2) {
  update(arr1, item.item_id, item);
}
console.log(arr1)
CodePudding user response:
One way would be to create a map that tells you exactly where to find the item in arr1.
Something like this:
const mappedArr1 = {
  2: {},     // Reference to item 2
  39646: {}, // Reference to item 39646
  35648: {}, // Reference to item 35648
  35771: {}, // Reference to item 35771
}
And when you update just do:
const item = mappedArr1[id];
item.attributes.push(object);
To create the map, you could use something like this:
const map = array => {
  const data = {};
  const doMap = array => array.forEach(i => {
    data[i.item_id] = i;
    if (i.children && i.children.length > 0) {
      doMap(i.children);
    }
  });
  doMap(array);
  return data;
}
const mappedArr1 = map(arr1);
Here is the full code:
var arr1 = [{
  "item_id": 2,
  "item_name": "test",
  "children": [{
    "item_id": 39646,
    "item_name": "test1",
    "children": [{
      "item_id": 35648,
      "item_name": "test2",
      "children": [{
        "item_id": 35771,
        "item_name": "test3",
        "children": [],
        "attributes": []
      }]
    }]
  }]
}]
var arr2 = [
  {
    "item_id": 35771,
    "attr_value": "test"
  }, {
    "item_id": 35771,
    "attr_value": "test1"
  }
]
const map = array => {
  const data = {};
  const doMap = array => array.forEach(i => {
    data[i.item_id] = i;
    if (i.children && i.children.length > 0) {
      doMap(i.children);
    }
  });
  doMap(array);
  return data;
}
const mappedArr1 = map(arr1);
for (let i of arr2) {
  const item = mappedArr1[i.item_id];
  // Skip ids that are missing from arr1 (or nodes without an attributes array)
  if (item && item.attributes) {
    item.attributes.push(i.attr_value);
  }
}
console.log(arr1)
Use JSBench to test the performance of multiple implementations; you will need a bigger data sample: https://jsbench.me
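For example, a throwaway generator like the one below can build a larger sample for the benchmark. It is only a sketch: the depth, ids, and counts are arbitrary assumptions, not taken from the question.

// Hypothetical benchmark data only: the shapes mirror arr1/arr2, the numbers are made up
const makeTree = (depth, id = 1) => ({
  item_id: id,
  item_name: "item_" + id,
  attributes: [],
  children: depth > 0 ? [makeTree(depth - 1, id + 1)] : []
});

const bigArr1 = [makeTree(50)]; // one branch, 50 levels deep
const bigArr2 = Array.from({ length: 150000 }, (_, i) => ({
  item_id: (i % 50) + 1,        // spread the updates across the tree
  attr_value: "value_" + i
}));

Feeding bigArr1 and bigArr2 to both the original update loop and the map-based version in separate JSBench cases should make the difference visible.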
CodePudding user response:
We can walk the tree once, after converting your second array into this format:
{
  35771: ['test', 'test1']
  // ...
}
and, for each node of the tree, add the attributes from this list if they exist. Here's an implementation:
const updateForest = (atts) => (forest) =>
  forest .map (({item_id, children = [], ...rest}) => ({
    item_id,
    ...rest,
    children: updateForest (atts) (children),
    ...(atts [item_id] ? {attributes: atts [item_id]} : {})
  }))

const addAttributes = (forest, atts) =>
  updateForest (atts .reduce (
    (a, {item_id, attr_value}) => ((a [item_id] = a [item_id] || []), (a [item_id] .push (attr_value)), a),
    {}
  )) (forest)
const arr1 = [{item_id: 2, item_name: "test", children: [{item_id: 39646, item_name: "test1", children: [{item_id: 35648, item_name: "test2", children: [{item_id: 35771, item_name: "test3", children: [], attributes: []}]}]}]}]
const arr2 = [{item_id: 35771, attr_value: "test"}, {item_id: 35771, attr_value: "test1"}]
console .log (addAttributes (arr1, arr2))
updateForest receives the attributes reformatted as described and returns a function which takes your forest structure (it's not a tree, because it's not necessarily singly-rooted) and visits its nodes, adding attributes if they exist, recurring on the list of children.
Our public function is addAttributes. This uses reduce to do that format conversion on your attributes, passing it and then the forest to updateForest.
It's important to note that this does not mutate your original structures, but creates new ones. I find this an important goal when coding. But if your data is so large that we can't fit two copies in memory, then we'd have to skip this approach.
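If memory does turn out to be the constraint, the same attribute-map idea can be applied in a mutating variant. This is only a sketch (the helper name addAttributesInPlace is mine, and it assumes nodes that should receive attributes either already have an attributes array or can have one added):

// Sketch of a mutating fallback: builds the same id -> [attr_value] lookup,
// then walks the existing forest once and pushes into it in place.
const addAttributesInPlace = (forest, atts) => {
  const byId = {};
  for (const {item_id, attr_value} of atts) {
    (byId[item_id] = byId[item_id] || []).push(attr_value);
  }
  const walk = (nodes) => nodes.forEach(node => {
    if (byId[node.item_id]) {
      (node.attributes = node.attributes || []).push(...byId[node.item_id]);
    }
    if (node.children && node.children.length > 0) {
      walk(node.children);
    }
  });
  walk(forest);
  return forest;
};

Calling addAttributesInPlace(arr1, arr2) then modifies arr1 directly instead of returning a fresh copy, which keeps only one copy of the tree in memory.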