For a NumPy array:
a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
You can get a slice using something like a[3:6].
But what about getting everything outside that slice? What is the most computationally efficient method for this? Conceptually, something like a[:3, 6:].
The best I can come up with is to use np.concatenate:
np.concatenate([a[:3], a[6:]], axis=0)
I am wondering if this is the best method, as I will be doing millions of these operations for a data processing pipeline.
CodePudding user response:
Your solution appears to be the most efficient of the options I tried: in the timings below it is more than 2x faster than the next best one, np.delete.
a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
%timeit -n 100000 np.concatenate([a[:3], a[6:]], axis=0)
2.03 µs ± 75.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit -n 100000 np.delete(a, slice(3, 6))
4.61 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit -n 100000 a[np.r_[:3,6:]]
11 µs ± 350 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
However, the real question is whether these operations (taking the complement of a slice, i.e. deleting it) need to be applied one after another. If not, you could aggregate the indices to remove via set operations and take the complement a single time at the end to obtain the final NumPy array.
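The aggregation idea could look roughly like the sketch below. It assumes the ranges to drop are known up front as half-open (start, stop) pairs and that the array is 1-D. The helper name drop_ranges is hypothetical, and a boolean mask stands in for explicit set operations.

```python
import numpy as np

def drop_ranges(a, ranges):
    """Return a copy of 1-D `a` without the elements in any (start, stop) range.

    Hypothetical helper for illustration; ranges are half-open and may overlap.
    """
    keep = np.ones(a.shape[0], dtype=bool)  # start by keeping every element
    for start, stop in ranges:
        keep[start:stop] = False            # mark each range for removal
    return a[keep]                          # a single mask selection at the end

a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
result = drop_ranges(a, [(3, 6), (7, 8)])
print(result)  # [0 1 2 6 8]
```

Whether this beats repeated np.concatenate calls depends on how many ranges there are and on the array size, so it is worth timing on your own data.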
CodePudding user response:
I find that declaring an empty array and filling it is very slightly faster than using np.concatenate. As André mentioned in their comment, this will vary with the shape of the array.
a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
def testing123():
    new = np.zeros(6, dtype=int)
    new[:3] = a[0:3]
    new[3:] = a[6:]
    return new
%timeit -n 100000 np.concatenate([a[:3], a[6:]], axis=0)
100000 loops, best of 5: 2.18 µs per loop
%timeit -n 100000 np.delete(a, slice(3, 6))
100000 loops, best of 5: 6.11 µs per loop
%timeit -n 100000 a[np.r_[:3,6:]]
100000 loops, best of 5: 16.4 µs per loop
%timeit -n 100000 testing123()
100000 loops, best of 5: 2.01 µs per loop
a = np.arange(10_000)
def testing123():
    new = np.empty(5000, dtype=int)
    new[:2500] = a[:2500]
    new[2500:] = a[7500:]
    return new
%timeit -n 100000 np.concatenate([a[:2500], a[7500:]], axis=0)
100000 loops, best of 5: 3.99 µs per loop
%timeit -n 100000 np.delete(a, slice(2500, 7500))
100000 loops, best of 5: 7.76 µs per loop
%timeit -n 100000 a[np.r_[:2500,7500:]]
100000 loops, best of 5: 47.3 µs per loop
%timeit -n 100000 testing123()
100000 loops, best of 5: 3.61 µs per loop
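The hard-coded sizes above can be generalized. The following is only a sketch of the same preallocate-and-fill idea for arbitrary cut points. The function name remove_slice is hypothetical, and it assumes a 1-D array with 0 <= start <= stop <= len(a). It has not been benchmarked here.

```python
import numpy as np

def remove_slice(a, start, stop):
    """Return a copy of 1-D `a` without a[start:stop] (hypothetical helper).

    Assumes 0 <= start <= stop <= len(a).
    """
    n = a.shape[0]
    # Allocate the result once; reuse a.dtype so values are not cast.
    out = np.empty(n - (stop - start), dtype=a.dtype)
    out[:start] = a[:start]  # part before the removed slice
    out[start:] = a[stop:]   # part after the removed slice
    return out

a = np.arange(10_000)
out = remove_slice(a, 2500, 7500)
print(out.shape)  # (5000,)
```

Using a.dtype instead of a fixed dtype=int keeps the result consistent with the input for float or other integer arrays.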