For the sake of simplicity, here is a small example: I have an $array and a few key/value pairs that I want to add to it. Which is better, primarily from a performance perspective:
- adding all of the key/value pairs to the array in one statement, or
- adding them one by one?
$array = [
$key1 => $value1,
$key2 => $value2
];
OR
$array[$key1] = $value1;
$array[$key2] = $value2;
CodePudding user response:
If this were a suspected bottleneck (perhaps millions of items?), I would go straight to a small performance test. See this little script, comparing updates of multiple keys in one statement ($array = ['key' => ...])
with separate key updates in consecutive statements ($array['key'] = ...),
repeated 1,000,000 times:
$time_start = microtime(true);
$a1 = array();
for ($i = 0; $i < 1000000; $i++) {
    $a1 = [
        'key1' => $i,
        'key2' => $i + 1
    ];
}
$time_end = microtime(true);
printf('Took %f seconds for inline array[key = ]<br>', $time_end - $time_start);
$time_start = microtime(true);
$a2 = array();
for ($i = 0; $i < 1000000; $i++) {
    $a2['key1'] = $i;
    $a2['key2'] = $i + 1;
}
$time_end = microtime(true);
printf('Took %f seconds for array[key] = <br>', $time_end - $time_start);
That gives me (the picture is more or less the same on each run):
Took 0.195255 seconds for inline array[key = ]
Took 0.204276 seconds for array[key] =
So it really doesn't matter: there is no noticeable difference you need to worry about. That said, updating multiple keys in one statement does seem to be slightly faster most of the time, though not on every run.
And that is also what we could expect. Think about it logically: updating the array keys in one statement is slightly more efficient than updating the same keys with multiple consecutive statements, simply because the array in memory is accessed fewer times.
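Note that the inline version in the benchmark reassigns the whole array each iteration, which is not quite the same as adding keys to an existing array. If the array already has contents and you still want to add several keys in one statement, PHP's + union operator or array_merge() can do it; a minimal sketch of the difference between the two:

```php
<?php
// Start with an existing array.
$config = ['host' => 'localhost'];

// The union operator adds keys from the right-hand side that are
// NOT already present on the left; existing keys win.
$config += ['host' => 'example.com', 'port' => 8080];
// $config is now ['host' => 'localhost', 'port' => 8080]

// array_merge() does the opposite for string keys: values from the
// later array overwrite earlier ones.
$merged = array_merge($config, ['host' => 'example.com']);
// $merged is now ['host' => 'example.com', 'port' => 8080]
```

Which one you want depends on whether existing keys should be preserved or overwritten.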
CodePudding user response:
If you have a handful of keys/values, it will make absolutely no difference. If you are adding tens or hundreds of thousands of members, there may be a minute difference (in the order of milliseconds), but I wouldn't know which way without benchmarking or digging into the internals. To keep things in context:
$r = [];
for($i = 1; $i <= 100000; $i++) {
$r[] = $i; // for numerically indexed array
// $r["k_{$i}"] = $i; // for associative array
// array_push($r, $i); // with function call
}
This generates an array with 100,000 members, one by one. When added with a numeric (auto)index, this loop takes ~0.0025 sec on my laptop, with memory peaking at ~6.8MB. If I use array_push(), it takes ~0.0065 sec, with the function-call overhead. When $i is added with a named key, it takes ~0.015 sec, with memory peaking at ~12.8MB. So named keys are slower.
But would it make a difference if you shaved 0.015 sec to 0.012 sec? Or even, 0.15 sec to 0.12 sec? Not really. What you actually do with that volume of data will take much longer, and should be the focus of your optimization efforts.
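If you want to reproduce these numbers yourself, here is a minimal timing/memory harness using microtime() and memory_get_peak_usage(); exact figures will of course vary by machine and PHP version:

```php
<?php
$time_start = microtime(true);

$r = [];
for ($i = 1; $i <= 100000; $i++) {
    $r["k_{$i}"] = $i; // named keys; swap in $r[] = $i to compare
}

printf(
    "%.4f sec, peak memory %.1f MB\n",
    microtime(true) - $time_start,
    memory_get_peak_usage() / 1048576
);
```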
N.B. You could manually prep an array of 100K members defined in one set, compare it to building the same array one by one, include each version, and see if you gain any real performance manna in the process. I doubt it very much. (Stay tuned, updating this space with a test.)
Bottom line: Simply construct your arrays in a manner that makes your code easy to maintain. I personally find it "cleaner" to keep simple arrays "contained":
$data = [
'length' => 100,
'width' => 200,
'foobar' => 'possibly'
];
Sometimes your array needs to "refer to itself" and the "one-by-one" format is necessary:
$data['length'] = 100;
$data['width'] = 200;
$data['square'] = $data['length'] * $data['width'];
Sometimes you build multidimensional arrays, where it may be "cleaner" to separate each main member:
$data = [];
$data['shapes'] = ['square', 'triangle', 'octagon'];
$data['sizes'] = [100, 200, 300, 400];
$data['colors'] = ['red', 'green', 'blue'];
On a final note, by far the more limiting performance factor with PHP arrays is memory usage (see: array hashtable internals), which is unrelated to how you build your arrays. If you have massive datasets in arrays, make sure you don't keep unnecessary modified copies of them floating around beyond their scope of relevance. Otherwise your memory usage will rocket.
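To make the "unnecessary copies" point concrete, here is a small sketch that relies on PHP's copy-on-write behavior: assigning an array to a new variable is cheap, but the first write to the copy duplicates the whole array in memory.

```php
<?php
$big = range(1, 500000);       // a reasonably large array

$before      = memory_get_usage();
$copy        = $big;           // no real copy yet (copy-on-write)
$afterAssign = memory_get_usage();

$copy[]      = 0;              // first write: the array is duplicated now
$afterWrite  = memory_get_usage();

// Free the copy as soon as it is no longer needed.
unset($copy);
```

Comparing the deltas ($afterAssign - $before versus $afterWrite - $afterAssign) shows where the memory actually goes, which is why dropping stale copies with unset() matters far more than how you built the array in the first place.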