Fastest way to fill numpy array with new arrays from function

I have a function f(a) that takes one entry from a testarray and returns an array with 5 values:

f(testarray[0])
#Output: array([[0, 1, 5, 3, 2]])

Since f(testarray[0]) is the result of an experiment, I want to run this function f for each entry of the testarray and store each result in a new numpy array. I always thought this would be quite simple: create an empty numpy array with the length of the testarray and save the results the following way:

N = 1000  # Number of entries of the testarray
test_result = np.zeros([N, 5], dtype=int)

for i in testarray:
    test_result[i] = f(i)

When I run this, I don't receive any error message, but the results are nonsense (half of test_result is empty while the rest is filled with implausible values). Since f() works perfectly for a single entry of the testarray, I suppose that something is wrong with the way I save the results in test_result. What am I missing here?

(I know that I could save the results in a list by appending to an empty list, but this method is too slow for the large number of times I want to run the function).

CodePudding user response：

Since you don't seem to understand indexing, stick with this approach:

alist = [f(i) for i in testarray]
arr = np.array(alist)

I could show how to use row indices and testarray values together, but that requires more explanation.
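
For reference, a minimal sketch of pairing row indices with testarray values, using enumerate. The f below is only a hypothetical stand-in for the real experiment, and the names row and value are my own choices, not something from the question:

```python
import numpy as np

def f(x):
    # Hypothetical stand-in for the real experiment: returns 5 integers.
    return x * np.arange(1, 6)

testarray = np.array([5, 6, 7, 3, 1])
test_result = np.zeros((len(testarray), 5), dtype=int)

# enumerate yields (position, value): the position selects the row,
# the value is what gets passed to f.
for row, value in enumerate(testarray):
    test_result[row] = f(value)

print(test_result[0])  # [ 5 10 15 20 25]
```

The position decides where the result is stored, so the values in testarray no longer need to be valid row numbers.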

CodePudding user response：

Your problem can be reproduced with the following small example:

testarray = np.array([5, 6, 7, 3, 1])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 5 10 15 20 25]

test_result = np.zeros([len(testarray), 5], dtype=int)  # len(testarray) or testarray.shape[0]

So, as hpaulj mentioned in the comments, you need to be careful with the indexing. Loop over the positions and use each position to index both arrays:

for i in range(len(testarray)):
    test_result[i] = f(testarray[i])

# [[ 5 10 15 20 25]
#  [ 6 12 18 24 30]
#  [ 7 14 21 28 35]
#  [ 3  6  9 12 15]
#  [ 1  2  3  4  5]]
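
As a cross-check, the positional loop and the list comprehension from the other answer should give the same array. The sketch below assumes a stand-in f that returns shape (1, 5), which is what the output shown in the question suggests; the real f may differ. With that shape, the list version picks up an extra axis that has to be dropped:

```python
import numpy as np

def f(x):
    # Hypothetical stand-in that mimics the question's output shape (1, 5).
    return (x * np.arange(1, 6)).reshape(1, 5)

testarray = np.array([5, 6, 7, 3, 1])

# Preallocated array filled by position; the (1, 5) result fits into one row.
by_index = np.zeros((len(testarray), 5), dtype=int)
for i in range(len(testarray)):
    by_index[i] = f(testarray[i])

# List comprehension; stacking (1, 5) arrays gives shape (N, 1, 5).
by_list = np.array([f(v) for v in testarray])
print(by_list.shape)  # (5, 1, 5)

# Dropping the extra axis makes both results identical.
by_list = by_list.reshape(len(testarray), 5)
print(np.array_equal(by_index, by_list))  # True
```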

There is another case, where the testarray is itself an index array: a shuffled set of the integers from 0 to N-1, so that every row of the zero array test_result gets filled. For this case we can create a reproducible example:

testarray = np.array([4, 3, 0, 1, 2])

def f(x):
    return np.array([x * i for i in np.arange(1, 6)])

f(testarray[0])
# [ 4  8 12 16 20]

test_result = np.zeros([len(testarray), 5], dtype=int)

In this case your loop fills every row, but row i holds f(i), so the rows follow the values and not the order of testarray:

for i in testarray:
    test_result[i] = f(i)
# [[ 0  0  0  0  0]
#  [ 1  2  3  4  5]
#  [ 2  4  6  8 10]
#  [ 3  6  9 12 15]
#  [ 4  8 12 16 20]]

As this loop shows, if the values in the index array do not cover every integer from 0 to N-1, some rows of the zero array are never written and stay zero (unchanged):

testarray = np.array([4, 2, 4, 1, 2])
test_result = np.zeros([len(testarray), 5], dtype=int)  # start again from zeros

for i in testarray:
    test_result[i] = f(i)
# [[ 0  0  0  0  0]
#  [ 1  2  3  4  5]
#  [ 2  4  6  8 10]
#  [ 0  0  0  0  0]   # <--
#  [ 4  8 12 16 20]]
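
To see in advance which rows such a loop would skip, one option is to compare the values in testarray against the full range of row numbers. This is only a diagnostic sketch; the names never_written, values and counts are my own:

```python
import numpy as np

testarray = np.array([4, 2, 4, 1, 2])

# Row numbers that `test_result[i] = f(i)` would never write to.
never_written = np.setdiff1d(np.arange(len(testarray)), testarray)
print(never_written)  # [0 3]

# Values that occur more than once: their row is overwritten
# instead of a new row being filled.
values, counts = np.unique(testarray, return_counts=True)
print(values[counts > 1])  # [2 4]
```

If never_written is not empty, the values in testarray cannot double as row indices, and the loop over positions is the safer choice.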