I'm working on the Titanic dataset, and after running several algorithms I have NumPy arrays of y_predictions. I want to compare them and extract only the positions where the values are equal across all arrays. For example:
index | a | b | c | d |
---|---|---|---|---|
0 | 1 | 1 | 1 | 1 |
1 | 1 | 0 | 1 | 1 |
2 | 0 | 0 | 1 | 0 |
3 | 0 | 1 | 0 | 1 |
4 | 0 | 0 | 0 | 0 |
a, b, c and d are the y_predictions of the algorithms. The output should be [1, 0, 0, 0, 1], because at indexes 0 and 4 all the values are equal, so I assign 1 there and 0 otherwise. Basically, I want to see the indexes (passengers) that all of the algorithms identify as 'Survived', which is represented by 1.
Here is my code:
a= [1,1,0,0,0]
b= [1,0,0,1,0]
c= [1,1,1,0,0]
d= [1,1,0,1,0]
L= [a,b,c,d]
holder = L[0]
for i in range(len(L)):
    equality = np.where(holder == L[i + 1], holder, 'None')
    holder = equality
But I get some errors. I would appreciate any suggestions.
CodePudding user response:
Your L array has the wrong shape: you need the transpose of L to get the table in your description. I also suggest converting it to a NumPy array:
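import numpy as np

a = [1,1,0,0,0]
b = [1,0,0,1,0]
c = [1,1,1,0,0]
d = [1,1,0,1,0]
# Transpose so each row is one passenger and each column is one algorithm
L = np.array([a,b,c,d]).T
result = []
for i in range(len(L)):
    # 1 if every prediction in row i matches the first one, otherwise 0
    result.append(int(np.all(L[i,:] == L[i,0])))
print(result)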
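If you want to avoid the Python loop, the same row-wise check can be written with broadcasting. This is only a sketch of one possible variant, using the same a, b, c, d as above; the name `agree` is just something I made up for illustration:

```python
import numpy as np

a = [1, 1, 0, 0, 0]
b = [1, 0, 0, 1, 0]
c = [1, 1, 1, 0, 0]
d = [1, 1, 0, 1, 0]

# Rows are passengers, columns are algorithms (same layout as the table).
L = np.array([a, b, c, d]).T

# L[:, [0]] keeps the first column as shape (5, 1), so the comparison
# broadcasts across all four columns at once.
agree = (L == L[:, [0]]).all(axis=1)

result = agree.astype(int).tolist()
print(result)  # [1, 0, 0, 0, 1]
```

A row is marked 1 only when every column equals the first one, which is the same test the loop performs one row at a time.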
CodePudding user response:
Use the .all method on arr to check for columns where all values are 1, then use .all again on (arr==0) to check for columns where all values are 0. The sum of these two arrays will be your desired outcome:
arr = np.array([a,b,c,d])
# True where every algorithm predicts 1, or every algorithm predicts 0
out = (arr.all(0) + (arr==0).all(0)).astype(int)
Output:
[1, 0, 0, 0, 1]
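Since the question also mentions wanting the indexes of passengers that every algorithm marks as 'Survived', here is a rough sketch of how that could be pulled out separately. It assumes the same a, b, c, d as in the question; the names `all_survived` and `survived_idx` are hypothetical, chosen only for this example:

```python
import numpy as np

a = [1, 1, 0, 0, 0]
b = [1, 0, 0, 1, 0]
c = [1, 1, 1, 0, 0]
d = [1, 1, 0, 1, 0]

# One row per algorithm, one column per passenger.
arr = np.array([a, b, c, d])

# Positions where all algorithms give the same label (either 0 or 1).
agree = (arr == arr[0]).all(axis=0)

# Positions where all algorithms predict 1 ('Survived').
all_survived = arr.all(axis=0)

# Integer indexes of those unanimous 'Survived' passengers.
survived_idx = np.flatnonzero(all_survived)

print(agree.astype(int).tolist())  # [1, 0, 0, 0, 1]
print(survived_idx.tolist())       # [0]
```

With this data only passenger 0 is predicted as 'Survived' by all four algorithms; index 4 is also unanimous, but for the label 0.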