numpy array vectorization: [False True] refers to [0 1]

import numpy as np

# the code below converts non-positive numbers to 0 and positive numbers to 1 in the numpy array

x = np.array([1,4,5,-5,4])
x[x>0] = 1
x[x<=0] = 0

# it turns out I can also do it like this

x = (x>0) * 1

It is not surprising that False is 0 and True is 1, but my lecturers told us that we should not use them (False and True) as integers. So which code is better?

CodePudding user response：

Regardless of the method, make sure you have a clear idea of what the variables are at each step. For arrays that means a clear idea of the dtype.

In [26]: x = np.array([1,4,5,-5,4])

Applying the > test to the int dtype array produces a boolean dtype array of the same shape

In [27]: (x>0)
Out[27]: array([ True,  True,  True, False,  True])

Perhaps the clearest way to get the int array is to use the astype casting:

In [28]: (x>0).astype(int)
Out[28]: array([1, 1, 1, 0, 1])

The multiplication does that casting implicitly, since the boolean has to be converted to int to do the multiply. It works just as well, but might not be as clear to humans:

In [29]: (x>0)*1
Out[29]: array([1, 1, 1, 0, 1])
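
To keep track of the dtype at each step, as suggested above, it can help to inspect it directly. A minimal sketch (the names mask, as_int and times_one are made up for illustration; the exact integer width depends on the platform):

```python
import numpy as np

x = np.array([1, 4, 5, -5, 4])

mask = x > 0                 # comparison gives a boolean array
as_int = mask.astype(int)    # explicit cast to an integer dtype
times_one = mask * 1         # implicit cast, triggered by the multiplication

print(mask.dtype)                      # bool
print(as_int.dtype, times_one.dtype)   # integer dtypes, width is platform dependent
print(as_int.tolist())                 # [1, 1, 1, 0, 1]
print(times_one.tolist())              # [1, 1, 1, 0, 1]
```

Both routes end up with the same values; the astype version just states the conversion in the code.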

There are places where masked assignment is desirable, but here I'd say it's overkill:

In [30]: x[x>0] = 1
    ...: x[x<=0] = 0
In [31]: x
Out[31]: array([1, 1, 1, 0, 1])

It's also slower, since it has to do the x<=0 test as well.
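
If you want to check the speed claim on your own machine, something like the following should do. This is only a rough sketch: the array size, the repeat count and the helper names masked and cast are arbitrary choices of mine, and the masked version includes the cost of a copy so the input is not modified between runs.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
big = rng.integers(-10, 10, size=100_000)

def masked(a):
    # masked assignment works in place, so operate on a copy
    a = a.copy()
    a[a > 0] = 1
    a[a <= 0] = 0
    return a

def cast(a):
    # a single comparison followed by an explicit cast
    return (a > 0).astype(int)

t_masked = timeit.timeit(lambda: masked(big), number=50)
t_cast = timeit.timeit(lambda: cast(big), number=50)
print(f"masked assignment: {t_masked:.4f}s  astype: {t_cast:.4f}s")
```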

A cleaner version of that masking is:

In [32]: np.where(x>0, 1, 0)
Out[32]: array([1, 1, 1, 0, 1])
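
np.where also covers cases where the two replacement values are not 1 and 0, which is where the boolean-as-integer trick stops working. A small sketch, with the names signs and clipped made up for illustration:

```python
import numpy as np

x = np.array([1, 4, 5, -5, 4])

# pick 1 where the condition holds, -1 elsewhere
signs = np.where(x > 0, 1, -1)
print(signs.tolist())    # [1, 1, 1, -1, 1]

# keep the original value where the condition holds, 0 elsewhere
clipped = np.where(x > 0, x, 0)
print(clipped.tolist())  # [1, 4, 5, 0, 4]

# np.where returns a new array; x itself is left untouched
print(x.tolist())        # [1, 4, 5, -5, 4]
```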

CodePudding user response：

Every computer language has its own idioms: common ways of doing things that are not necessarily the "prettiest", but that are well known within that community. One common case is writing x[:] instead of x.copy() to copy a list: the latter is more expressive of your intent, but the former is shorter. (This applies to Python lists; slicing a NumPy array with x[:] returns a view, not a copy.)
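
A quick check of what x[:] actually does may be useful, since the idiom behaves differently for Python lists and NumPy arrays. A minimal sketch (all variable names here are made up for illustration):

```python
import numpy as np

# Python list: slicing makes a new, independent list
items = [1, 2, 3]
items_copy = items[:]
items_copy[0] = 99
print(items)           # [1, 2, 3]

# NumPy array: slicing gives a view onto the same data
arr = np.array([1, 2, 3])
arr_view = arr[:]
arr_view[0] = 99
print(arr.tolist())    # [99, 2, 3]

# NumPy array: .copy() gives independent data
arr2 = np.array([1, 2, 3])
arr_copy = arr2.copy()
arr_copy[0] = 99
print(arr2.tolist())   # [1, 2, 3]
```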

Likewise, it is common knowledge that True and False convert to 1 and 0 in arithmetic expressions. You are equally likely to see either of these:

sum(1 for x in array if x > 0)

sum(x > 0 for x in array)

Which is better? That's a matter of taste. To most seasoned Python programmers, both are equally readable.
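
For what it's worth, the two forms give the same count. Here is a small sketch using a plain Python list as the data (my assumption, the snippets above don't say what array is); np.count_nonzero is added only as a NumPy-side comparison:

```python
import numpy as np

array = [1, 4, 5, -5, 4]

# count by summing a 1 for each element that passes the test
count_a = sum(1 for x in array if x > 0)

# count by summing the booleans directly (True adds 1, False adds 0)
count_b = sum(x > 0 for x in array)

# NumPy equivalent: count the True entries of the boolean mask
count_c = np.count_nonzero(np.asarray(array) > 0)

print(count_a, count_b, count_c)   # 4 4 4
```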