How to use 'greater than' on two numpy arrays?

How would I go about comparing these two arrays in Python using 'greater than' > ?

If the value of array[i] is greater than the value of array_two[i], I want to assign 1 in a DataFrame (that part isn't important; the '>' condition is).

array([6.10486534e-02, 3.20790148e-01, 2.56859660e-01, 2.56859660e-01,
       3.03715386e-01, 3.58806774e-01, 3.76682551e-01, 1.37473505e-01,
       3.58806774e-01, 3.71317921e-04, 3.45058974e-01, 3.45058974e-01,
       8.81391314e-02, 3.20790148e-01, 9.42457566e-03, 8.81391314e-02,
       3.95105724e-01, 5.34622633e-03, 2.08727973e-01, 2.91399310e-03,
       8.81391314e-02, 2.08727973e-01, 2.56859660e-01, 3.85616747e-01,
       1.62975022e-01, 3.58806774e-01, 2.08727973e-01, 2.56859660e-01,
       1.59636395e-02, 3.58806774e-01, 3.85616747e-01, 1.80399797e-01,
       3.76682551e-01, 3.45058974e-01, 8.81391314e-02, 3.58806774e-01,
       1.22269205e-01, 9.42457566e-03, 1.62975022e-01, 3.71317921e-04,
       3.20790148e-01, 3.98205068e-01, 3.20790148e-01, 4.06292733e-02,
       3.95105724e-01, 2.56859660e-01, 3.98205068e-01, 3.45058974e-01,
       8.81391314e-02, 1.00660158e-01])

array([0.3709119 , 0.06697823, 0.35351773, 0.35351773, 0.31950921,
       0.09175405, 0.23816167, 0.01401676, 0.09175405, 0.05914856,
       0.28009387, 0.28009387, 0.39048359, 0.06697823, 0.22254767,
       0.39048359, 0.1964211 , 0.18148102, 0.37939101, 0.14354477,
       0.39048359, 0.37939101, 0.35351773, 0.12191716, 0.39492163,
       0.09175405, 0.37939101, 0.35351773, 0.26470548, 0.09175405,
       0.12191716, 0.02169437, 0.23816167, 0.28009387, 0.39048359,
       0.09175405, 0.3987336 , 0.22254767, 0.39492163, 0.05914856,
       0.06697823, 0.1571276 , 0.06697823, 0.3417329 , 0.1964211 ,
       0.35351773, 0.1571276 , 0.28009387, 0.39048359, 0.00878408])

I get this error if I try array > array_two

'The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()'

CodePudding user response：

This may be what you want:

df_arr = arr1 > arr2  # Element-wise comparison; the result is a new Boolean array.
print(df_arr)

df_int_arr = df_arr.astype(int)  # Convert True to 1 and False to 0.
print(df_int_arr)

And here is the output:

[False  True False False False  True  True  True  True False  True  True
 False  True False False  True False False False False False False  True
 False  True False False False  True  True  True  True  True False  True
 False False False False  True  True  True False  True False  True  True
 False  True]

[0 1 0 0 0 1 1 1 1 0 1 1 0 1 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 1 1 1 1 1 0 1 0
 0 0 0 1 1 1 0 1 0 1 1 0 1]
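A side note on the error quoted in the question: the comparison itself returns a Boolean array and does not raise. The "truth value ... is ambiguous" message typically appears when that array is used where a single True/False is expected, such as an `if` statement. A minimal sketch, with short made-up arrays standing in for the ones in the question:

```python
import numpy as np

# Short illustrative arrays (not the data from the question).
arr1 = np.array([0.06, 0.32, 0.25, 0.40])
arr2 = np.array([0.37, 0.07, 0.35, 0.10])

mask = arr1 > arr2  # element-wise comparison; no error here

try:
    if mask:  # a multi-element array has no single truth value
        pass
    raised = False
except ValueError:
    raised = True

any_greater = bool(mask.any())  # True if at least one element is greater
all_greater = bool(mask.all())  # True only if every element is greater
```

So `.any()` or `.all()` is only the right fix when a single yes/no answer is wanted; for a per-element result, use the Boolean array directly as shown in the answer.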

For more about Boolean indexing, see Chapter Four of my blog. The official NumPy guide on array indexing and slicing also covers it.
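As a small illustration of the Boolean indexing mentioned here, the comparison result can be used directly to select or overwrite elements. The arrays are short made-up stand-ins, and `selected` and `flags` are hypothetical names:

```python
import numpy as np

# Short illustrative arrays (not the data from the question).
arr1 = np.array([0.06, 0.32, 0.25, 0.40])
arr2 = np.array([0.37, 0.07, 0.35, 0.10])

mask = arr1 > arr2

# Keep only the elements of arr1 that exceed the matching element of arr2.
selected = arr1[mask]

# Start from zeros and write 1 wherever the condition holds.
flags = np.zeros(arr1.shape, dtype=int)
flags[mask] = 1
```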

Hope this answer helps you~

CodePudding user response：

You should really post code that reproduces your issue, rather than just two arrays that don't show what you did to get that error. In general, though, something like this might be what you want:

import pandas as pd

df = pd.DataFrame({'col1': arr1, 'col2': arr2})
m = df.col1 > df.col2
# mask() writes 1 where m is True; where(m, 1) would do the opposite.
df['col1'] = df.col1.mask(m, 1)
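`Series.where` and `Series.mask` are easy to mix up, so here is a small sketch of how each treats the same condition. The data is made up, and `kept_where_true` and `set_where_true` are hypothetical names:

```python
import pandas as pd

# Short illustrative data (not the arrays from the question).
df = pd.DataFrame({"col1": [0.06, 0.32, 0.25, 0.40],
                   "col2": [0.37, 0.07, 0.35, 0.10]})
m = df.col1 > df.col2

# where: keep col1 where m is True, put 1 everywhere else.
kept_where_true = df.col1.where(m, 1)

# mask: put 1 where m is True, keep col1 everywhere else.
set_where_true = df.col1.mask(m, 1)
```

If the goal is "assign 1 when col1 is greater than col2", `mask` (or `where` with the inverted condition `~m`) is the variant that does so.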

You might also be able to use a choice list, but again, without seeing any code I'm not exactly sure:

df['col1'] = np.select([df.col1 > df.col2, df.col1 <= df.col2], [1, df.col1])
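With only two branches, `np.where` may be a simpler alternative to a choice list. This is a sketch on made-up data, and the `flag` column is a hypothetical name for the "assign 1 to a df" step in the question:

```python
import numpy as np
import pandas as pd

# Short illustrative data (not the arrays from the question).
df = pd.DataFrame({"col1": [0.06, 0.32, 0.25, 0.40],
                   "col2": [0.37, 0.07, 0.35, 0.10]})

# Separate 0/1 indicator column: 1 where col1 > col2, otherwise 0.
df["flag"] = (df.col1 > df.col2).astype(int)

# Two-branch version of the np.select call: 1 where col1 > col2, else keep col1.
df["col1"] = np.where(df.col1 > df.col2, 1, df.col1)
```

`np.select` is more useful once there are three or more conditions to choose between.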