How to select parts of a dataset where a criterion is met in another dataset in Python?

Time:08-24

I have two datasets in Python. The first is a mask: its values are 1 at the indexes I want and nan everywhere else. The second is full of random data. I want to select the data in the second dataset at the positions where the first dataset is 1.

For example, if the dataset 'a' is:

[[nan, nan, nan, nan,],
 [nan, 1, 1, 1,],
 [nan, nan, 1, nan,],
 [nan, nan, nan, nan]]

and 'b' is:

[[1.4, 1.5, 1.8, 1.9],
 [1.1, 1.2, 1.5, 1.8],
 [1.1, 1.3, 1.6, 1.2],
 [1.4, 1.2, 1.8, 1.9]]

then the output should be:

[[nan, nan, nan, nan],
 [nan, 1.2, 1.5, 1.8],
 [nan, nan, 1.6, nan],
 [nan, nan, nan, nan]]

I'm not sure what the right method for this is.

CodePudding user response:

I think you only need:

import numpy as np

# nan times anything is nan, and 1 times a value leaves it unchanged
r = a * b

np.array([1.2, 2, np.nan]) * np.array([np.nan, 1, 1])
# array([nan,  2., nan])
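
Applied to the arrays from the question, the multiplication can be sketched as below. This assumes the mask holds only nan and exactly 1; any other marker value would scale the data instead of just selecting it.

```python
import numpy as np

nan = np.nan

# Mask from the question: 1 where data should be kept, nan elsewhere
a = np.array([[nan, nan, nan, nan],
              [nan, 1, 1, 1],
              [nan, nan, 1, nan],
              [nan, nan, nan, nan]])

# Data from the question
b = np.array([[1.4, 1.5, 1.8, 1.9],
              [1.1, 1.2, 1.5, 1.8],
              [1.1, 1.3, 1.6, 1.2],
              [1.4, 1.2, 1.8, 1.9]])

# Element-wise product: nan propagates, 1 leaves the value unchanged
r = a * b
print(r)
```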

CodePudding user response:

In this case, @ansev's answer is the best solution, because multiplying your values by 1 gives exactly the results you want. If that isn't the case and you have a different condition, you can use np.where.

import numpy as np

a = np.array([[np.nan, np.nan, np.nan, np.nan,],
 [np.nan, 1, 1, 1,],
 [np.nan, np.nan, 1, np.nan,],
 [np.nan, np.nan, np.nan, np.nan]])

b = np.array([[1.4, 1.5, 1.8, 1.9],
 [1.1, 1.2, 1.5, 1.8],
 [1.1, 1.3, 1.6, 1.2],
 [1.4, 1.2, 1.8, 1.9]])

# Keep b where a equals 1, and put nan everywhere else
c = np.where(a == 1, b, np.nan)
print(c)
# Output:
# [[nan nan nan nan]
#  [nan 1.2 1.5 1.8]
#  [nan nan 1.6 nan]
#  [nan nan nan nan]]
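
As an illustration of a different condition, here is a minimal sketch where the mask marks wanted positions with arbitrary non-nan numbers rather than 1. The arrays `mask` and `data` are made up for this example and are not from the question. Multiplying would distort the data here, so the condition tests for "not nan" instead.

```python
import numpy as np

# Hypothetical mask: any non-nan number marks a wanted position
mask = np.array([[np.nan, 2.0],
                 [5.0, np.nan]])

# Hypothetical data to select from
data = np.array([[1.4, 1.5],
                 [1.1, 1.2]])

# Keep data wherever the mask is not nan, and put nan elsewhere
selected = np.where(~np.isnan(mask), data, np.nan)
print(selected)
```

The same pattern works for any boolean condition of the same shape, for example `mask > 0`.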
