Long For Loop Excecution Time-CodePudding

I am attempting to compare the numerical output of a MATLAB script to that of numpy.

The task, from a homework assignment, is to add the value of 10^-N to itself 10^N times.

The goal is to demonstrate that computers have limitations that will compound in such a calculation. The result should always be 1, but as N increases, the error will also increase. For this problem, N should be 0, 1, 2, ..., 8.

My MATLAB script runs quickly with no issues:

solns = zeros(1, 9);

for N = 0:8
    for ii = 1:(10^N)
        solns(N 1) = solns(N 1)   10^-N;
    end
end

However, using Python 3.8.12 on an Anaconda installation with numpy, this code will not terminate:

import numpy as np

solns = np.zeros((9, 1))

for N in range(len(solns)):
    for _ in range(10**N):
        solns[N]  = (10**-N)

Is there an error in the Python code or is there some significance difference between the languages that creates this result.

EDIT:

I let the python code run for awhile and it eventually terminated after 230 seconds. The whole array printed out as just [1., 1., 1., etc.].

CodePudding user response：

You're rather abusing NumPy there. Especially the = with an array element seems to take almost all of the time.

This is how I'd do it, takes about 0.5 seconds:

from itertools import repeat

for N in range(9):
    print(sum(repeat(10**-N, 10**N)))

Output (Try it online!):

1
0.9999999999999999
1.0000000000000007
1.0000000000000007
0.9999999999999062
0.9999999999980838
1.000000000007918
0.99999999975017
1.0000000022898672

As @AndrasDeak commented, it's fast due to several optimizations. Let's try fewer optimizations as well.

First, only get rid of NumPy, use simple Python floats. Your original btw takes me about 230 seconds as well. This only about 29 seconds:

for N in range(9):
    total = 0
    for _ in range(10**N):
        total  = (10**-N)
    print(total)

Next, don't recompute the added value over and over again. Takes about 11 seconds:

for N in range(9):
    total = 0
    add = 10**-N
    for _ in range(10**N):
        total  = add
    print(total)

Next, get rid of range producing lots of int objects for no good reason. Let's use itertools.repeat, it's the fastest iterable I know. Takes about 9 seconds:

from itertools import repeat

for N in range(9):
    total = 0
    add = 10**-N
    for _ in repeat(None, 10**N):
        total  = add
    print(total)

Or alternatively, about the same speed:

from itertools import repeat

for N in range(9):
    total = 0
    for add in repeat(10**-N, 10**N):
        total  = add
    print(total)

Simply put it into a function to benefit from local variables being faster than globals. Takes about 3.3 seconds then:

from itertools import repeat

def run():
    for N in range(9):
        total = 0
        for add in repeat(10**-N, 10**N):
            total  = add
        print(total)
run()

That might be the fastest I can do with an ordinary clean for loop (loop unrolling would help some more, but ugh...).

My sum(repeat(...)) version lets C code do all the work (and sum even optimizes the summation of floats), making it still quite a bit faster.