How Order of Operations Can Seriously Change The Code Speed?-CodePudding

I watched a video about code optimization from Tarodev. At the end of the video, I am shocked because I saw that order of operations can seriously change the code speed. Although I tried it myself, I don't understand how can it be possible? How can Float x Float x Vector 2 times faster than Float x Vector x Float or Vector x Float x Float?

I installed Tarodev's project from GitHub and tried it myself. The result was the same as with the video.

EDIT:

Thanks for all these great answers guys. After your answers, I understood the reason and felt dumb :) Also, I'm adding a video Repository and example code at Orhtej2's suggestion

CodePudding user response：

Just first of all

someVector3 * someFloat

and

someFloat * someVector3

both basically return

new Vector3(someVector3.x * someFloat, someVector3.y * someFloat, someVector3.z * someFloat)

so basically 3 float * float operations constructor call of Vector3

And then the order matters because your operations will be resolved in the order like this

In the first case you do

(float * float) * Vector3

so in steps:

1. (float * float) => single float * operation => result: float
2. (float * Vector3) => 3 float * operations => result: Vector3

in the second you do

(float * Vector3) * float

so in steps:

1. (float * Vector3) => 3 float * operations => result: Vector3
2. (Vector3 * float) => 3 float * operations => result: Vector3

and in the third equivalent you do

(Vector2 * float) * float

so in steps

1. (Vector3 * float) => 3 float * operations => result: Vector3
2. (Vector3 * float) => 3 float * operations => result: Vector3

In the first case you have only 4 * operations while in the other two cases you have 6.

Plus in the second and third case you also are instantiating an additional Vector3 (the result from the first operation) which also costs performance and memory.

CodePudding user response：

To multiply a float by a Vector3, three multiplies are required. To multiply a float by a float, clearly just one multiply is requred.

When you do:

result = a * b * c;

That means:

result = (a * b) * c;

or more explicitly:

var tmp = a * b;
result = tmp * c;

Suppose a and b are floats and c is a vector. Then a * b performs one multiply (float by float) and results in a float, then the second operation performs three multiplies (float by Vector3). That's four multiplies in total.

Now support a is a vector and b and c are floats. Then a * b performs three multiplies (Vector3 by float) and results in a Vector3, then the second operation again performs three multiplies (Vector3 by float) and results in a Vector3. That's six multiplies in total.

Now, it's possible that the compiler might be able to figure out that these are equivalent and do the cheaper four multiplies instead. But often it's not easy for it to see that. And often it's not quite equivalent. For example, rounding and overflow can depend very much on which way round you do the operations, and the compiler won't mess with them because it could introduce inaccuracies, so it trusts you and simply does exactly what you tell it.

CodePudding user response：

The answer is pretty easy,follow my example:

a Vector is an object made of more than 1 value: e.g. Vector3 v = new Vector3(1,2,3)

if u multiply a number for a Vector3, you are doing 3 operations.

float n = 3.f
Vector3 v = new Vector3(1,2,3)

n*v result is a Vector3(1*n,2*n,3*n),so 3 calculations for one multiplication. following your example in the first case you are doing n*n*v,so 1 calculation for n*n 3 calculation for n^2 * v,for a total of 4 calculations.

in the second case you are doing n * v * n,that means 3 calculations for n*v another 3 calculations for v*n,for a total of 6 calculations.

CodePudding user response：

Vector x Scalar1 x Scalar2  => (Vector x Scalar1) x Scalar2 => 
VectorTemp x Scalar2 = EndResult

So to summarize, we had to operate two times a scalar by vector operation. meaning 2 times a Vector length multiplication.

Where in constrast

Scalar1 x Scalar2 x Vector = (Scalar1 x Scalar2) x Vector => 
Scalar3 x Vector = EndResult.

Here one Vector x Scalar operation is required. Scalar1 by Scalar2 is only multipication

So I guess the key idea here is if operations are commutative, starts with the cheapest ones.