Caching the result of modulus operations: why doesn't that improve performance?

Time:09-17

I have two similar loops in Python. The first is exactly the same as the second, except that in the second loop I attempted an optimization by remembering the results of the repeated modulus operations in variables:

from timeit import default_timer as timer

loop1_time = 0
loop2_time = 0

for i in range(101):

    start = timer()
    for loop_number in range(50):
        if loop_number % 3 == 0 and loop_number % 5 == 0:
            a = 1
            # print('FizzBuzz {} ' .format(loop_number))
        elif loop_number % 5 == 0:
            # print('Buzz {} '.format(loop_number))
            b = 2
        elif loop_number % 3 == 0:
            # print('Fizz {}' .format(loop_number))
            c = 3
    end = timer()
    loop1_time = loop1_time + (end - start)
    # print("First loop took " + str(end - start))

    start = timer()
    for loop_number in range(50):
        answer_of_3 = loop_number % 3
        answer_of_5 = loop_number % 5

        if answer_of_3 == 0 and answer_of_5 == 0:
            # print('FizzBuzz {} ' .format(loop_number))
            d = 1
        elif answer_of_5 == 0:
            # print('Buzz {} '.format(loop_number))
            e = 2
        elif answer_of_3 == 0:
            # print('Fizz {}' .format(loop_number))
            f = 3
    end = timer()
    loop2_time = loop2_time + (end - start)
    # print("Second loop took " + str(end - start))

print("First loop on average took " + str(loop1_time / 101))
print("Second loop on average took " + str(loop2_time / 101))

The two prints at the end consistently show that the second loop is a little slower than the first. However, since the second loop saves the results of a couple of operations to avoid re-executing them, I was under the impression that it would be faster.

Any thoughts on why the second loop is slower?

CodePudding user response:

I was able to replicate your results when copying the code and running it as a top-level script as-is: the first version is slightly faster. Putting the loops inside functions and using timeit, however, gives different results:

import timeit
from functools import partial
def f1(reps):
    for loop_number in range(reps):
        if loop_number % 3 == 0 and loop_number % 5 == 0:
            a = 1
        elif loop_number % 5 == 0:
            b = 2
        elif loop_number % 3 == 0:
            c = 3

def f2(reps):
    for loop_number in range(reps):
        answer_of_3 = loop_number % 3
        answer_of_5 = loop_number % 5

        if answer_of_3 == 0 and answer_of_5 == 0:
            a = 1
        elif answer_of_5 == 0:
            b = 2
        elif answer_of_3 == 0:
            c = 3

print(min(timeit.Timer(partial(f1, 1000)).repeat(5, 10000)))
print(min(timeit.Timer(partial(f2, 1000)).repeat(5, 10000)))
# 1.0620667609982775
# 0.9467761240011896

Now the second loop is slightly faster. Even if you repeat your test exactly, but put it inside a function, the second loop is still faster.

The reason is specific to CPython (and behavior like this generally differs between compiled and interpreted languages). The cost of a variable lookup is not the same for locals and globals: inside a function, local variables live in a fixed-size array and are accessed by index (the LOAD_FAST bytecode), while global lookups go through the module's dictionary (LOAD_GLOBAL), which is slower. At module level, every name — including your loop variables and the cached results — is a global, so the extra assignments and lookups in the second loop cost more than the modulus operations they save.

So yes, your intuition about caching being faster is often correct for code running inside functions, which is hopefully most of your code. But performance and optimizations can be extremely hard to predict beforehand, especially with interpreted implementations like CPython, where, famously,

x += x
x *= 2
x <<= 1

all run at different speeds despite being equivalent.
