While loop outperforming for loop

I'm writing a simple numerical integration function. No matter what I change in my code, my while implementation always beats my for implementation. That seems odd: I expected for to be faster, since it does not have to evaluate a boolean expression and increment a variable on every iteration.

The code:

import math
import time
import numpy as np

def function(x):
    return math.cos(x)

def integrate_while(func, x_start, x_end, n_steps=10_000):
    step_length = (x_end - x_start) / n_steps
    area = 0
    x = x_start
    x_end -= step_length
    while x < x_end:
        area += func(x) + func(x + step_length)
        x += step_length
    area *= step_length / 2
    return area

def integrate_for(func, x_start, x_end, n_steps=10_000):
    step_length = (x_end - x_start) / n_steps
    area = 0
    ls = np.linspace(x_start, x_end, n_steps, endpoint=False)  # left edges, spaced step_length apart
    for x in ls:
        area += func(x) + func(x + step_length)
    area *= step_length / 2
    return area

def integrate_for_range(func, x_start, x_end, n_steps=10_000):
    step_length = (x_end - x_start) / n_steps
    area = 0
    for i in range(n_steps):
        x = x_start + i * step_length
        area += func(x) + func(x + step_length)
    area *= step_length / 2
    return area

def test():
    integrate_funcs = [integrate_while, integrate_for, integrate_for_range]
    for integrate_func in integrate_funcs:
        t1 = time.time_ns()
        result = integrate_func(function, 0, math.pi / 2, n_steps=1_000_000)
        t2 = time.time_ns()
        print(f'Function {integrate_func.__name__}. Result: {round(result, 4)}, Elapsed ns: {t2-t1:,}.')

test()

Results:

Function integrate_while. Result: 1.0, Elapsed ns: 569,587,400.
Function integrate_for. Result: 1.0, Elapsed ns: 638,829,800.
Function integrate_for_range. Result: 1.0, Elapsed ns: 596,499,300.

Edit: I already checked how much creating the np.linspace object contributes to the total execution time of the for loop, and it is less than 1% of the total.

User response:

The np.linspace array was the problem: NumPy arrays are not designed to be iterated element by element in a Python loop, and doing so carries significant overhead.

After replacing np.linspace with the native range type, the for loop outperformed the simple while loop.
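The iteration overhead can be observed directly. The following is a minimal sketch, not part of the original benchmark: the helper name `consume` and the array size are assumptions chosen for illustration. It shows that looping over an ndarray hands back a NumPy scalar object for each element, whereas a list built with `tolist()` already holds built-in floats. On a typical CPython setup the list loop tends to be the faster of the two, but the exact numbers depend on the machine.

```python
import timeit
import numpy as np

n = 100_000                      # illustrative size, not from the original post
arr = np.linspace(0.0, 1.0, n)   # NumPy array of float64 values
lst = arr.tolist()               # the same values as built-in Python floats

# Iterating the array creates a NumPy scalar object for each element...
first_from_array = next(iter(arr))
# ...while iterating the list returns the floats it already stores.
first_from_list = next(iter(lst))
print(type(first_from_array).__name__, type(first_from_list).__name__)

def consume(iterable):
    # Hypothetical helper: walk the iterable in a plain Python loop.
    total = 0.0
    for x in iterable:
        total += x
    return total

t_array = timeit.timeit(lambda: consume(arr), number=5)
t_list = timeit.timeit(lambda: consume(lst), number=5)
print(f"ndarray: {t_array:.3f} s, list: {t_list:.3f} s")
```

If the array has to stay a NumPy array, the usual alternative is to avoid the Python-level loop altogether and apply vectorized operations to the whole array.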

Some further changes improved performance as well:

Setting the variable types before entering the for loop (10% less time).
Changing the loop logic to avoid extra function calls (50% less time).

Final function code:

def integrate_for_range_opt(func, x_start, x_end, n_steps=10_000):
    step_length = (x_end - x_start) / n_steps
    area = 0.0
    x = float(x_start) + step_length
    for i in range(n_steps - 1):
        area = area + func(x)
        x = x + step_length
    area = area * 2.0 + func(x_start) + func(x_end)
    return area * step_length / 2.0
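The saving from avoiding extra function calls can be checked by counting how often the integrand is evaluated. The sketch below is illustrative only: `make_counter`, `trapezoid_naive` and `trapezoid_shared` are hypothetical names, and the two integrators are condensed copies of the range-based version from the question and the optimized version above. The naive loop evaluates both edges of every trapezoid, so it makes 2 * n_steps calls, while the reworked loop evaluates each interior point once and makes n_steps + 1 calls.

```python
import math

def make_counter(func):
    # Wrap func so that every call is tallied in wrapper.calls.
    def wrapper(x):
        wrapper.calls += 1
        return func(x)
    wrapper.calls = 0
    return wrapper

def trapezoid_naive(func, x_start, x_end, n_steps):
    # Evaluates both edges of every trapezoid: 2 * n_steps calls.
    step = (x_end - x_start) / n_steps
    area = 0.0
    for i in range(n_steps):
        x = x_start + i * step
        area += func(x) + func(x + step)
    return area * step / 2.0

def trapezoid_shared(func, x_start, x_end, n_steps):
    # Evaluates each interior point once and doubles the sum: n_steps + 1 calls.
    step = (x_end - x_start) / n_steps
    area = 0.0
    x = float(x_start) + step
    for _ in range(n_steps - 1):
        area += func(x)
        x += step
    area = area * 2.0 + func(x_start) + func(x_end)
    return area * step / 2.0

naive_f = make_counter(math.cos)
shared_f = make_counter(math.cos)
a = trapezoid_naive(naive_f, 0.0, math.pi / 2, 1_000)
b = trapezoid_shared(shared_f, 0.0, math.pi / 2, 1_000)
print(naive_f.calls, shared_f.calls)  # 2000 1001
print(abs(a - b) < 1e-9)              # both variants compute the same sum
```

Halving the number of calls to func lines up with the roughly 50% time reduction reported above, since the function call dominates the cost of each iteration.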

User response:

Your question doesn't actually ask a question and to me it seemed to ask for an explanation, but your comments and your own answer give me the impression that what you really want is a fast solution. So here's a faster one, making Python functions do more work for you:

from itertools import accumulate, repeat

def integrate_map_accumulate(func, x_start, x_end, n_steps=10_000):
    step_length = (x_end - x_start) / n_steps
    xs = accumulate(repeat(step_length, n_steps - 2),
                    initial=x_start + step_length)
    area = sum(map(func, xs))
    area = area * 2.0 + func(x_start) + func(x_end)
    return area * step_length / 2.0

In my testing with your benchmark code, that reduced the time from ~0.21 seconds (for your answer's solution) to ~0.14 seconds.
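To reproduce a comparison like this on another machine, one option is `timeit.repeat` rather than a single `time.time_ns` measurement. The sketch below is an assumption about how such a check could look, not the benchmark used above: `best_time` is a hypothetical helper, the step count is reduced to keep the run short, and the two integrators are copies of the functions from the answers. `accumulate(..., initial=...)` requires Python 3.8 or later.

```python
import math
import timeit
from itertools import accumulate, repeat

def integrate_for_range_opt(func, x_start, x_end, n_steps=10_000):
    step_length = (x_end - x_start) / n_steps
    area = 0.0
    x = float(x_start) + step_length
    for i in range(n_steps - 1):
        area = area + func(x)
        x = x + step_length
    area = area * 2.0 + func(x_start) + func(x_end)
    return area * step_length / 2.0

def integrate_map_accumulate(func, x_start, x_end, n_steps=10_000):
    step_length = (x_end - x_start) / n_steps
    # Lazily yields the n_steps - 1 interior points x_start + k * step_length.
    xs = accumulate(repeat(step_length, n_steps - 2),
                    initial=x_start + step_length)
    area = sum(map(func, xs))
    area = area * 2.0 + func(x_start) + func(x_end)
    return area * step_length / 2.0

def best_time(integrate_func, n_steps=100_000, repeats=3):
    # Hypothetical helper: take the best of several runs to dampen noise.
    timings = timeit.repeat(
        lambda: integrate_func(math.cos, 0.0, math.pi / 2, n_steps=n_steps),
        number=1, repeat=repeats)
    return min(timings)

for f in (integrate_for_range_opt, integrate_map_accumulate):
    result = f(math.cos, 0.0, math.pi / 2, n_steps=100_000)
    print(f"{f.__name__}: result={result:.6f}, best={best_time(f):.4f} s")
```

Taking the minimum over several repeats is a common way to reduce the influence of other processes on the measurement. Absolute times will differ from the figures quoted above.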