a = [1, 3, 6, -2, 4, 5, 8, -3, 9,
2, -5, -7, -9, 3, 6, -7, -6, 2]
I want to get:
a = [1, 3, 6, 4, 5, 8, 9, 2,
-5, -7, -9, 3, 6, -7, -6, 2]
That is, only the 4th and 8th elements are deleted, because each is a single negative element between two positive elements.
import numpy as np
a = [1, 3, 6, -2, 4, 5, 8, -3, 9,
2, -5, -7, -9, 3, 6, -7, -6, 2]
for i in range(len(a)):
    if a[i] < 0 and a[i - 1] > 0 and a[i + 1] > 0:
        np.delete(a[i])
print(a)
This did not work. What do I need to fix?
CodePudding user response:
Because you ask about numpy in the subject line and also attempt to use np.delete() in your code, I assume you intend for a to be a numpy array.
Here is a way to do what your question asks using vectorized operations in numpy:
import numpy as np
a = np.array([1,3,6,-2,4,5,8,-3,9,2,-5,-7,-9, 3, 6, -7, -6, 2])
b = np.concatenate([a[1:], [np.nan]])
c = np.concatenate([[np.nan], a[:-1]])
d = (a<0)&(b>0)&(c>0)
print(a[~d])
Output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
What we've done is to shift a one to the left with NaN fill on the right (b) and one to the right with NaN fill on the left (c), then to create a boolean mask d using the vectorized comparison and boolean operators <, > and &, which is True only where we want to delete single negative values sandwiched between positives. Finally, we use the ~ operator to flip the boolean values of the mask and use it to filter out the unneeded negative values in a.
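To make the shifting step concrete, here is a minimal sketch that prints the intermediate arrays on a shorter input. The array x and the names right, left and mask are assumptions made for illustration and are not from the question. The NaN fill matters because any comparison against NaN is False, which keeps the first and last positions out of the mask.

```python
import numpy as np

# Hypothetical short input, chosen only to keep the printout readable.
x = np.array([1, -2, 4, -5, -7, 3])

# Right neighbours: shift left, NaN fill on the right (the array becomes float).
right = np.concatenate([x[1:], [np.nan]])
# Left neighbours: shift right, NaN fill on the left.
left = np.concatenate([[np.nan], x[:-1]])

# True only for a negative value whose two neighbours are both positive.
# Comparisons against NaN are False, so the first and last slots stay False.
mask = (x < 0) & (right > 0) & (left > 0)

print(right)
print(left)
print(mask)

# Filtering the original integer array keeps its integer dtype.
result = x[~mask]
print(result)
```

Only the -2 is removed here; the run -5, -7 survives because neither value has two positive neighbours.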
UPDATE: Here are some timeit() comparisons of several variations on answers given for this question using NumPy 1.22.2.
The fastest of the 8 strategies is:
a = np.concatenate([a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]])
A close second is:
a = a[np.concatenate([[True], ~((a[1:-1]<0)&(a[2:]>0)&(a[:-2]>0)), [True]])]
The strategies using np.r_, either with np.delete() or with a boolean mask and [] syntax, are about twice as slow as the fastest.
The strategy using numpy.roll() is about 3 times as slow as the fastest. Note: As highlighted in a comment by @Kelly Bundy, the roll() strategy in the benchmark does not give a correct answer to this question in all cases (though for the particular input example it happens to). I have nevertheless included it in the benchmark because the performance of roll() relative to concatenate() and r_ may be of general interest beyond the narrow context of this question.
Results:
foo_1 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
foo_2 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
foo_3 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
foo_4 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
foo_5 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
foo_6 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
foo_7 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
foo_8 output:
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
Timeit results:
foo_1 ran in 1.2354546000715346e-05 seconds using 100000 iterations
foo_2 ran in 1.0962473000399769e-05 seconds using 100000 iterations
foo_3 ran in 7.733614000026136e-06 seconds using 100000 iterations
foo_4 ran in 7.751871000509709e-06 seconds using 100000 iterations
foo_5 ran in 5.856722998432815e-06 seconds using 100000 iterations
foo_6 ran in 7.5727709988132115e-06 seconds using 100000 iterations
foo_7 ran in 1.7790602000895887e-05 seconds using 100000 iterations
foo_8 ran in 5.435103999916464e-06 seconds using 100000 iterations
Code that generated the results:
import numpy as np
a = np.array([1,3,6,-2,4,5,8,-3,9,2,-5,-7,-9, 3, 6, -7, -6, 2])
from timeit import timeit
def foo_1(a):
    a = a if a.shape[0] < 2 else np.delete(a, np.r_[False, (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0), False])
    return a
def foo_2(a):
    a = a if a.shape[0] < 2 else a[np.r_[True, ~((a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0)), True]]
    return a
def foo_3(a):
    b = np.concatenate([a[1:], [np.nan]])
    c = np.concatenate([[np.nan], a[:-1]])
    d = (a<0)&(b>0)&(c>0)
    a = a[~d]
    return a
def foo_4(a):
    a = a[~((a<0)&(np.concatenate([a[1:], [np.nan]])>0)&(np.concatenate([[np.nan], a[:-1]])>0))]
    return a
def foo_5(a):
    a = a if a.shape[0] < 2 else a[np.concatenate([[True], ~((a[1:-1]<0)&(a[2:]>0)&(a[:-2]>0)), [True]])]
    return a
def foo_6(a):
    a = a if a.shape[0] < 2 else np.delete(a, np.concatenate([[False], (a[1:-1]<0)&(a[2:]>0)&(a[:-2]>0), [False]]))
    return a
def foo_7(a):
    mask_bad = (
        (a < 0) &                # the value is < 0 AND
        (np.roll(a,1) >= 0) &    # the value to the left is >= 0 (wraps around at index 0) AND
        (np.roll(a,-1) >= 0)     # the value to the right is >= 0 (wraps around at the last index)
    )
    mask_good = ~mask_bad
    a = a[mask_good]
    return a
def foo_8(a):
    a = np.concatenate([a[:1], a[1:-1][(a[1:-1]>=0)|(a[2:]<=0)|(a[:-2]<=0)], a[-1:]])
    return a
foo_count = 8
for foo in ['foo_' + str(i + 1) for i in range(foo_count)]:
    print(f'{foo} output:')
    print(eval(f"{foo}(a)"))
n = 100000
print('Timeit results:')
for foo in ['foo_' + str(i + 1) for i in range(foo_count)]:
    t = timeit(f"{foo}(a)", setup=f"from __main__ import a, {foo}", number=n) / n
    print(f'{foo} ran in {t} seconds using {n} iterations')
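Since the benchmark only prints outputs for one input, a quick randomized check against a plain-Python reference can help catch edge-case differences between strategies. The following is a minimal sketch under stated assumptions: the helper names reference and masked, the seed and the value range are all hypothetical choices, and masked simply mirrors the shape of foo_5 above.

```python
import numpy as np

def reference(values):
    """Plain-Python reference: drop a negative that has two positive neighbours."""
    out = []
    for i, v in enumerate(values):
        interior = 0 < i < len(values) - 1
        if interior and v < 0 and values[i - 1] > 0 and values[i + 1] > 0:
            continue  # lone negative between positives: skip it
        out.append(v)
    return out

def masked(a):
    """Slice-based mask built with np.concatenate, mirroring foo_5."""
    if a.shape[0] < 2:
        return a
    drop = (a[1:-1] < 0) & (a[2:] > 0) & (a[:-2] > 0)
    return a[np.concatenate([[True], ~drop, [True]])]

rng = np.random.default_rng(0)          # fixed seed, an arbitrary choice
mismatches = 0
for _ in range(200):
    n = int(rng.integers(0, 12))        # includes lengths 0, 1 and 2
    a = rng.integers(-3, 4, size=n)     # small range so that zeros show up
    if masked(a).tolist() != reference(a.tolist()):
        mismatches += 1
print(mismatches)
```

Swapping another strategy in for masked is one way to see which variants agree with the reference on short arrays and on inputs containing zeros.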
CodePudding user response:
Your conditional logic
if a[i] < 0 and a[i - 1] > 0 and a[i + 1] > 0
seems sound and readable to me. But it would have issues with the boundary cases:
[1, 2, -3] -> IndexError: list index out of range
[-1, 2, 3] -> [2, 3]
Handling it properly could be as simple as skipping the first and last elements of your list with
for i in range(1, len(a) - 1)
Test
import numpy as np
def del_neg_between_pos(a):
    delete_idx = []
    for i in range(1, len(a) - 1):
        if a[i] < 0 and a[i - 1] > 0 and a[i + 1] > 0:
            delete_idx.append(i)
    return np.delete(a, delete_idx)
if __name__ == "__main__":
    a1 = [1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2]
    a2 = [1, 2, -3]
    a3 = [-1, 2, 3]
    for a in [a1, a2, a3]:
        print(del_neg_between_pos(a))
Output
[ 1 3 6 4 5 8 9 2 -5 -7 -9 3 6 -7 -6 2]
[ 1 2 -3]
[-1 2 3]
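If a is meant to stay a plain Python list, as it is in the question, the same boundary-safe rule can be sketched without NumPy. The function name drop_lone_negatives is hypothetical, and this is only one possible way to write it.

```python
def drop_lone_negatives(values):
    """Return a new list without negatives that sit between two positives."""
    last = len(values) - 1
    return [
        v
        for i, v in enumerate(values)
        # Keep both ends unconditionally; drop an interior negative only when
        # both of its neighbours in the ORIGINAL list are positive.
        if i in (0, last)
        or not (v < 0 and values[i - 1] > 0 and values[i + 1] > 0)
    ]

a = [1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2]
print(drop_lone_negatives(a))
print(drop_lone_negatives([1, 2, -3]))
print(drop_lone_negatives([-1, 2, 3]))
```

Because the comprehension reads neighbours from the original list and builds a new one, nothing shifts under the loop while elements are being dropped.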
CodePudding user response:
A solution that handles edges correctly and doesn't create an unholy number of temporary arrays:
a = np.delete(a, np.r_[False, (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0), False])
Alternatively, you can create the positive rather than the negative mask
a = a[np.r_[True, (a[1:-1] >= 0) | (a[:-2] <= 0) | (a[2:] <= 0), True]]
Since np.concatenate is faster than np.r_, you could rephrase the masks as
np.concatenate(([False], (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0), [False]))
and
np.concatenate(([True], (a[1:-1] >= 0) | (a[:-2] <= 0) | (a[2:] <= 0), [True]))
In some cases, you might get extra mileage out of applying np.where(...)[0] or np.flatnonzero to the mask. This sometimes helps because it avoids having to count the masked elements twice.
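As a rough illustration of the index-based variant, the sketch below compares deleting with a boolean mask against deleting with positions from np.flatnonzero. The array big, the seed and the function names with_mask and with_indices are assumptions made for this example. Whether the index version is actually faster depends on the array size and the NumPy version, so the timings are printed rather than claimed.

```python
import numpy as np
from timeit import timeit

rng = np.random.default_rng(0)             # fixed seed, an arbitrary choice
big = rng.integers(-5, 6, size=100_000)    # hypothetical larger input

def with_mask(a):
    # Full-length boolean mask; the two ends are never deleted.
    interior = (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0)
    return np.delete(a, np.concatenate(([False], interior, [False])))

def with_indices(a):
    interior = (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0)
    # flatnonzero gives positions within a[1:-1]; + 1 realigns them with a.
    return np.delete(a, np.flatnonzero(interior) + 1)

same = np.array_equal(with_mask(big), with_indices(big))
print(same)

for f in (with_mask, with_indices):
    per_call = timeit(lambda: f(big), number=50) / 50
    print(f.__name__, per_call)
```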
CodePudding user response:
import numpy
a = numpy.array([1,3,6,-2,4,5,8,-3,9,2,-5,-7,-9, 3, 6, -7, -6, 2])
mask_bad = (
    (a < 0) &                   # the value is < 0 AND
    (numpy.roll(a,1) >= 0) &    # the value to the left is >= 0 AND
    (numpy.roll(a,-1) >= 0)     # the value to the right is >= 0
)
mask_good = ~mask_bad
print(a[mask_good])
is one way you might do this (although it probably has an issue or two at the edges)
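To show what the edge issue looks like in practice, here is a small sketch comparing the roll-based mask with a slice-based one. The inputs edge and zero and the function names are hypothetical. np.roll wraps around, so the last element acts as the left neighbour of the first, and the >= 0 tests also count a zero as a positive neighbour.

```python
import numpy as np

def roll_based(a):
    # Same mask as in the answer above; np.roll wraps around at both ends.
    bad = (a < 0) & (np.roll(a, 1) >= 0) & (np.roll(a, -1) >= 0)
    return a[~bad]

def slice_based(a):
    # Ends are always kept, so no wraparound can occur (assumes len(a) >= 2).
    bad = (a[1:-1] < 0) & (a[:-2] > 0) & (a[2:] > 0)
    return a[np.concatenate([[True], ~bad, [True]])]

edge = np.array([-1, 2, 3])   # negative first element
zero = np.array([0, -1, 2])   # zero as the left neighbour

print(roll_based(edge), slice_based(edge))
print(roll_based(zero), slice_based(zero))
```

In both cases the roll-based version drops the -1 while the slice-based version keeps it, which is the behaviour the question asks for.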
CodePudding user response:
Here is a simple one-liner:
import numpy as np
a = np.array([1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2])
n = np.where((a[1: -1] < 0) & (a[2:] > 0) & (a[:-2] > 0))[0] + 1
print(n)
Output:
[3 7]
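The one-liner yields the positions to remove rather than the cleaned array. One way to finish the job, sketched here with the hypothetical name cleaned, is to pass those positions to np.delete, which returns a new array instead of modifying a in place.

```python
import numpy as np

a = np.array([1, 3, 6, -2, 4, 5, 8, -3, 9, 2, -5, -7, -9, 3, 6, -7, -6, 2])

# Positions of lone negatives between two positives, as in the one-liner.
n = np.where((a[1:-1] < 0) & (a[2:] > 0) & (a[:-2] > 0))[0] + 1

# np.delete takes the array and the positions, and leaves `a` untouched.
cleaned = np.delete(a, n)
print(cleaned)
```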