I watched a video about code optimization from Tarodev. At the end of the video, I am shocked because I saw that order of operations can seriously change the code speed. Although I tried it myself, I don't understand how can it be possible? How can Float x Float x Vector 2 times faster than Float x Vector x Float or Vector x Float x Float?
I installed Tarodev's project from GitHub and tried it myself. The result was the same as with the video.
EDIT:
Thanks for all these great answers guys. After your answers, I understood the reason and felt dumb :) Also, I'm adding a video Repository and example code at Orhtej2's suggestion
CodePudding user response:
Just first of all
someVector3 * someFloat
and
someFloat * someVector3
both basically return
new Vector3(someVector3.x * someFloat, someVector3.y * someFloat, someVector3.z * someFloat)
so basically 3 float * float
operations constructor call of Vector3
And then the order matters because your operations will be resolved in the order like this
In the first case you do
(float * float) * Vector3
so in steps:
1. (float * float) => single float * operation => result: float 2. (float * Vector3) => 3 float * operations => result: Vector3
in the second you do
(float * Vector3) * float
so in steps:
1. (float * Vector3) => 3 float * operations => result: Vector3 2. (Vector3 * float) => 3 float * operations => result: Vector3
and in the third equivalent you do
(Vector2 * float) * float
so in steps
1. (Vector3 * float) => 3 float * operations => result: Vector3 2. (Vector3 * float) => 3 float * operations => result: Vector3
In the first case you have only 4 *
operations while in the other two cases you have 6.
Plus in the second and third case you also are instantiating an additional Vector3
(the result from the first operation) which also costs performance and memory.
CodePudding user response:
To multiply a float
by a Vector3
, three multiplies are required. To multiply a float
by a float
, clearly just one multiply is requred.
When you do:
result = a * b * c;
That means:
result = (a * b) * c;
or more explicitly:
var tmp = a * b;
result = tmp * c;
Suppose a
and b
are floats and c
is a vector. Then a * b
performs one multiply (float
by float
) and results in a float
, then the second operation performs three multiplies (float
by Vector3
). That's four multiplies in total.
Now support a
is a vector and b
and c
are floats. Then a * b
performs three multiplies (Vector3
by float
) and results in a Vector3
, then the second operation again performs three multiplies (Vector3
by float
) and results in a Vector3
. That's six multiplies in total.
Now, it's possible that the compiler might be able to figure out that these are equivalent and do the cheaper four multiplies instead. But often it's not easy for it to see that. And often it's not quite equivalent. For example, rounding and overflow can depend very much on which way round you do the operations, and the compiler won't mess with them because it could introduce inaccuracies, so it trusts you and simply does exactly what you tell it.
CodePudding user response:
The answer is pretty easy,follow my example:
a Vector is an object made of more than 1 value: e.g. Vector3 v = new Vector3(1,2,3)
if u multiply a number for a Vector3, you are doing 3 operations.
float n = 3.f
Vector3 v = new Vector3(1,2,3)
n*v
result is a Vector3(1*n,2*n,3*n)
,so 3 calculations for one multiplication.
following your example in the first case you are doing n*n*v
,so 1 calculation for n*n 3
calculation for n^2 * v
,for a total of 4 calculations.
in the second case you are doing n * v * n
,that means 3 calculations for n*v
another 3 calculations for v*n
,for a total of 6 calculations.
CodePudding user response:
Vector x Scalar1 x Scalar2 => (Vector x Scalar1) x Scalar2 =>
VectorTemp x Scalar2 = EndResult
So to summarize, we had to operate two times a scalar by vector operation. meaning 2 times a Vector length multiplication.
Where in constrast
Scalar1 x Scalar2 x Vector = (Scalar1 x Scalar2) x Vector =>
Scalar3 x Vector = EndResult.
Here one Vector x Scalar operation is required. Scalar1 by Scalar2 is only multipication
So I guess the key idea here is if operations are commutative, starts with the cheapest ones.