For context I was solving the day 6 problem from the 2021 Advent of Code, and wanted to try using numpy arrays over python lists since from my current understanding they should be faster. But ran into an issue, my solution prints the correct answer but takes ages to finish computing as the number_of_days_to_cycle_through variable scales.

I wanted help in understanding why the incredibly long scaling was occurring, and how to notice/prevent that mistake in my code going forward?(the lantern_fish_array is a numpy array of int64)

def iterate_through_one_day(lantern_fish_array):
     iterator = 0

     copy_of_lantern_fish_array = lantern_fish_array.copy()
     for fish in copy_of_lantern_fish_array:
         if fish == 0:
             lantern_fish_array = np.append(lantern_fish_array, 8)
             lantern_fish_array[iterator] = 6
         else:
             lantern_fish_array[iterator] -= 1
         iterator  = 1
     del copy_of_lantern_fish_array

     return new_lantern_fish_array


def solve_part_1(lantern_fish_array):
     num_of_days_to_cycle_through = 256
     while num_of_days_to_cycle_through != 0:
         lantern_fish_array = iterate_through_one_day(lantern_fish_array)
         num_of_days_to_cycle_through -= 1
     return lantern_fish_array.size

CodePudding user response：

You can do something like:

import numpy as np

# numpy style
def iterate_through_one_day_np(lantern_fish_array):
    new_array=np.copy(lantern_fish_array)
    mask_0 = new_array==0
    
    new_array[mask_0] = 6 
    new_array[~mask_0] -= 1  # or new_array[~mask_0] = new_array[~mask_0] - 1

    new_array = np.append(new_array, [8]*np.sum(mask_0))
    return new_array

# your code for reference
def iterate_through_one_day(lantern_fish_array):
    iterator = 0
    new_array=np.copy(lantern_fish_array)
    for fish in lantern_fish_array:
        if fish == 0:
            new_array = np.append(new_array, 8)
            new_array[iterator] = 6
        else:
            new_array[iterator] -= 1
        iterator  = 1
    
    return new_array

lantern_fish_array = [0,1,2,3,4,5]

iterate_through_one_day(lantern_fish_array)
# array([6, 0, 1, 2, 3, 8])

iterate_through_one_day(lantern_fish_array)
# array([6, 0, 1, 2, 3, 8])

Speed testing

Get a bit more fishes, i.e. 50 times the list with: [0,1,2,3,4,5]*50

%%timeit -r 3 -n 3
iterate_through_one_day_np([0,1,2,3,4,5]*50)
# 94.4 µs ± 4.89 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)

vs.

%%timeit -r 3 -n 3
iterate_through_one_day([0,1,2,3,4,5]*50)
# 878 µs ± 154 µs per loop (mean ± std. dev. of 3 runs, 3 loops each)

CodePudding user response：

Numpy arrays are fast due to operations being done in parallel (vectorized). For using looping they might well be slower than a list. So unless your operation can be expressed in a parallel manner (vectorized) you probably won't have any benefit when using a looping/iterative operation on a np array versus the same operation on a list.

See these resources on broadcasting and vectorization,

https://realpython.com/numpy-array-programming/

https://unidata.github.io/python-training/workshop/NumPy/numpy-broadcasting-and-vectorization/

Also some suggestions for when you can't avoid doing looping,

Fast Numpy Loops

CodePudding user response：

For array with many occurrences of 0, use numpy.where is much faster than modifying the value through index twice. For array with less occurrences of 0, its performance is also slightly faster than method of LHans:

>>> def one_day_use_where(lantern_fish_array):
...     mask = lantern_fish_array == 0
...     return np.concatenate([np.where(mask, 6, lantern_fish_array - 1), np.repeat(8, mask.sum())])
...

Some test:

>>> def one_day_use_index(lantern_fish_array):
...     mask = lantern_fish_array == 0
...     new_array = lantern_fish_array.copy()
...     new_array[mask] = 6
...     new_array[~mask] -= 1
...     return np.concatenate([new_array, np.repeat(8, mask.sum())])
...
>>> a = np.random.randint(0, 10, 10000)   # 0 accounts for about 10%
>>> timeit(lambda: one_day_use_where(a), number=10000)
0.4181383000104688
>>> timeit(lambda: one_day_use_index(a), number=10000)
0.8232910000078846
>>> a = np.random.randint(0, 2, 10000)    # 0 accounts for about 50%
>>> timeit(lambda: one_day_use_where(a), number=10000)
0.544302800000878
>>> timeit(lambda: one_day_use_index(a), number=10000)
1.917074600001797
>>> a = np.random.randint(1, 3, 10000)    # Does not contain 0
>>> timeit(lambda: one_day_use_where(a), number=10000)
0.38596799998776987
>>> timeit(lambda: one_day_use_index(a), number=10000)
0.3989579000044614
>>> a = np.zeros(10000, dtype=int)
>>> # If the proportion of 0 is too high, the performance will be slightly worse
>>> timeit(lambda: one_day_use_np_where(a), number=10000)
0.6532589000125881
>>> timeit(lambda: one_day_use_index(a), number=10000)
0.5977481000008993