Why does np.nonzero increase the running time when the image is smaller?

Time:09-03

I have been working on an algorithm to detect text regions in an image, and I implemented two functions to compare their performance.

The first one iterates over all pixels and skips those whose value is 0 in another binary image (the edge image):

for j in range(edgeImage.shape[0]):
    for i in range(edgeImage.shape[1]):

        if edgeImage[j,i] == 0:
            continue

For the second one I used np.nonzero to get all nonzero pixels, since I am not processing zeros:

edgePointRows, edgePointCols = np.nonzero(edgeImage)

for index in range(len(edgePointRows)):

    i = edgePointCols[index]
    j = edgePointRows[index]

But when I compare them with timeit, the second one takes longer to complete:

all pixels 3.2452468999999997
relevant pixels 3.463474800000001

I tried larger images, such as 4056x3040, and there the second version was faster. Why does it become the slower one when the image is smaller?

Am I making a mistake in the loop or in how I time it?

CodePudding user response:

In the nonzero case you are doing more work: you are looping over the image, finding the non-zero elements, and creating an array with their indices. Then you loop through those indices. In the other case, you never create that array, you just loop over the image once.

But the loop inside nonzero is compiled, and therefore fast, whereas the plain for loops are interpreted Python code and run much slower. The nonzero case has a smaller Python loop, so it can come out faster overall.

So this is a question of when the extra work done in the nonzero case matches the time saved by the smaller Python loop.
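To see where that break-even point falls, you can time both versions on synthetic images with different fractions of nonzero pixels. This is only a rough sketch: the function names, the image size, the densities, and the `count += 1` stand-in for the real per-pixel work are all assumptions, not taken from the question.

```python
import timeit

import numpy as np


def loop_all_pixels(edge_image):
    """Visit every pixel in Python and skip the zeros."""
    count = 0
    for j in range(edge_image.shape[0]):
        for i in range(edge_image.shape[1]):
            if edge_image[j, i] == 0:
                continue
            count += 1  # stand-in for the real per-pixel work
    return count


def loop_nonzero_pixels(edge_image):
    """Let np.nonzero find the edge pixels, then loop over only those."""
    edge_rows, edge_cols = np.nonzero(edge_image)
    count = 0
    for index in range(len(edge_rows)):
        i = edge_cols[index]
        j = edge_rows[index]
        count += 1  # stand-in for the real per-pixel work
    return count


def make_edge_image(shape, density, seed=0):
    """Random binary image; `density` is the fraction of nonzero pixels."""
    rng = np.random.default_rng(seed)
    return (rng.random(shape) < density).astype(np.uint8)


def compare(shape=(120, 160), densities=(0.01, 0.5, 1.0), repeats=3):
    """Return (density, seconds for all pixels, seconds for nonzero only)."""
    results = []
    for density in densities:
        img = make_edge_image(shape, density)
        t_all = timeit.timeit(lambda: loop_all_pixels(img), number=repeats)
        t_nz = timeit.timeit(lambda: loop_nonzero_pixels(img), number=repeats)
        results.append((density, t_all, t_nz))
    return results


if __name__ == "__main__":
    for density, t_all, t_nz in compare():
        print(f"density={density:.2f}  all pixels: {t_all:.4f}s  "
              f"nonzero only: {t_nz:.4f}s")
```

The timings depend on the machine and on how many pixels are nonzero, so treat the numbers as relative. If most pixels of your edge image are nonzero, the second loop runs almost as many iterations as the first and pays for np.nonzero on top.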

CodePudding user response:

NumPy functions have some overhead on every call (e.g. allocating the output array), which costs a roughly fixed time 'x' however small the input is.
So if the input is small enough, the pure Python loop can finish in less than 'x', which makes it the faster option.
Also, the second version still runs a Python for loop, and each iteration indexes two arrays and assigns the results, whereas the first version does only a comparison for every zero pixel.
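If you do need a Python loop over the edge pixels, one way to cut the per-iteration cost is to convert the index arrays to plain Python lists once and iterate over the pairs, instead of indexing the arrays inside the loop. A minimal sketch (the helper name `edge_coordinates` and the tiny test image are made up for illustration, and whether this is actually faster for your data is something you should time yourself):

```python
import numpy as np


def edge_coordinates(edge_image):
    """Return the (row, col) pairs of all nonzero pixels as Python ints."""
    rows, cols = np.nonzero(edge_image)
    # tolist() converts each index array to Python ints in one compiled pass,
    # so the loop below never has to index a NumPy array element by element.
    return list(zip(rows.tolist(), cols.tolist()))


edge_image = np.array([[0, 1, 0],
                       [1, 0, 0]], dtype=np.uint8)

for j, i in edge_coordinates(edge_image):
    print(j, i)  # j is the row, i is the column
```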

Using Python loops to read or assign the elements of a NumPy ndarray is not recommended; try to use NumPy's own vectorized operations instead.

You can create a mask for the image:

# mask will be an array of bool: True for every nonzero value, False otherwise
mask = imgArray != 0
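Once you have the mask, you can often do the per-pixel work without any Python loop at all. A small sketch with a made-up 2x3 array; setting the nonzero pixels to 255 is just an example operation, not something from the question:

```python
import numpy as np

# Small made-up image for illustration.
imgArray = np.array([[0, 3, 0],
                     [7, 0, 2]], dtype=np.uint8)

mask = imgArray != 0

# Boolean indexing returns the selected values as a 1-D array (row-major order).
nonzero_values = imgArray[mask]

# Summing a boolean mask counts its True entries.
nonzero_count = int(mask.sum())

# Assigning through the mask updates every selected pixel in one step.
result = imgArray.copy()
result[mask] = 255
```

This only helps if the work you do per pixel can itself be written as array operations. If each pixel needs logic that depends on its neighbours or on earlier results, you may still need a loop over the coordinates.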