I've been trying to write some code to delete rows from my 2d array according to the following criteria:
- every lone entry, so that no patient only has one entry (the mriindex ticks up by 1 for every entry of the same patient in the array)
- every entry above the 4th one.
Should either of those criteria be fulfilled, np.delete should remove the row currently being iterated over (i.e. the ith row of the array). The mriindex is the 6th column in my array.
I pass np.delete the array arr, the row index i and axis 0 (for rows, if I'm not mistaken), and assign the result to a new array new_arr.
As can be seen from the output though, my conditions aren't fulfilled. For example, the 4th person in the array (Alex Maier) should no longer be there (being a lone entry).
Help would be very much appreciated.
Code (very inefficient) is the following:
#remove single entries
i = 0
for i in range(n-1):
    if arr[i][5] == 1:
        if arr[i+1][5] == 1:
            new_arr = np.delete(arr, i, axis = 0)
            i = i+1
    if arr[i][5] != 1:
        if arr[i][5] > 4:
            new_arr = np.delete(arr, i, axis = 0)
            i = i+1
        else:
            i = i+1
Answer:
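Have a look at your code and check what is happening to new_arr when the next loop iteration starts. np.delete returns a new array and leaves arr unchanged, so every call starts again from the full arr, and new_arr only ever reflects the most recent deletion.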
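To see the effect in isolation, here is a minimal sketch. The toy array and the row indices are made up for illustration and are not taken from the question's data. It shows that np.delete does not modify its input, and that reassigning the loop variable inside a for loop has no effect on the next iteration.

```python
import numpy as np

# Hypothetical toy data standing in for the real array.
arr = np.array([10, 20, 30, 40])

# Each call works on the unchanged original and returns a fresh copy.
new_arr = np.delete(arr, 0, axis=0)  # copy of arr without row 0
new_arr = np.delete(arr, 2, axis=0)  # copy of arr without row 2; the first result is lost

print(arr)      # the original still has all four rows
print(new_arr)  # only row 2 is missing

# Reassigning i inside the loop body does not skip anything:
# range() hands out the next value regardless.
seen = []
for i in range(3):
    seen.append(i)
    i = i + 1
print(seen)
```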
The approach you chose is also not very efficient, because the array is copied every time you call np.delete. It is better to do it in one shot, something like this:
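import numpy as np

# create boolean masks with the intended logic
# True for the 1st to 4th entry of a patient
lessThan5 = arr[:,5] < 5
# difference to the next row's mriindex: it is 0 only when a 1 is
# followed by another 1, i.e. for a lone entry, so True means "not a lone entry"
notSingleEntry = np.diff(arr[:,5], append=[1]) != 0
# use np.where to get the indices of the rows that satisfy both masks
keepers = np.where(np.logical_and(lessThan5, notSingleEntry))
# index and save to new array
filtered_arr = arr[keepers[0],:]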
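The one-shot filtering can be tried on a small array. The following is only a sketch under some assumptions: the sample data and the patient id in column 0 are invented for illustration, the rows are assumed to be grouped by patient with the mriindex counting 1, 2, 3, ... per patient, and the mriindex column is assumed to be numeric. If the real array holds strings (for example because of the name column), the index column would first need converting, e.g. with astype(int).

```python
import numpy as np

# Hypothetical sample data: column 0 is a patient id, column 5 is the mriindex.
mri = [1, 2, 3, 1, 1, 2, 3, 4, 5, 6, 1]
pid = [0, 0, 0, 1, 2, 2, 2, 2, 2, 2, 3]
arr = np.zeros((len(mri), 6), dtype=int)
arr[:, 0] = pid
arr[:, 5] = mri

idx = arr[:, 5]

# Keep only the 1st to 4th entry of each patient.
at_most_4 = idx <= 4

# Next index minus current index. Under the assumptions above this is 0
# only when a 1 is followed by another 1 (a lone entry); append=[1] makes
# a lone entry in the last row behave the same way.
not_lone = np.diff(idx, append=[1]) != 0

# A boolean mask can index the rows directly, without np.where.
filtered = arr[at_most_4 & not_lone]

print(filtered[:, [0, 5]])
```

In this sample, patients 1 and 3 have a single entry each and are dropped entirely, and patient 2 loses the 5th and 6th entries.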