Most computationally efficient method to get the rest of the array of a slice in numpy array?

For a numpy array

a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])

You can get a slice using something like a[3:6]

But what about getting the rest of the array, that is, everything outside the slice? What is the most computationally efficient method for this? Conceptually, something like a[:3, 6:] (which is not valid indexing for a 1-D array).

The best I can come up with is to use a concatenate.

np.concatenate([a[:3], a[6:]], axis=0)

I am wondering if this is the best method, as I will be doing millions of these operations for a data processing pipeline.

CodePudding user response:

Your solution appears to be the most efficient of the options I tried: in the timings below it is more than 2x faster than the next best one, np.delete.

a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])

%timeit -n 100000 np.concatenate([a[:3], a[6:]], axis=0)

%timeit -n 100000 np.delete(a, slice(3, 6))

%timeit -n 100000 a[np.r_[:3,6:]]

>2.03 µs ± 75.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
>4.61 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
>11 µs ± 350 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
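
The %timeit lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard timeit module. This is only a rough sketch: the dictionary of candidates and the repeat counts are assumptions chosen for illustration, and the absolute numbers will differ by machine and NumPy version. Note that the np.r_ variant here spells out the stop of the second range, since np.r_ has no knowledge of the length of a.

```python
import timeit
import numpy as np

a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])

# Three ways to get everything outside a[3:6].
candidates = {
    "concatenate": lambda: np.concatenate([a[:3], a[6:]], axis=0),
    "delete": lambda: np.delete(a, slice(3, 6)),
    # np.r_ cannot know len(a), so the stop of the second range is explicit.
    "r_ index": lambda: a[np.r_[:3, 6:len(a)]],
}

n = 1_000  # calls per repeat (illustrative, smaller than in the answer)
timings = {}
for name, func in candidates.items():
    # Best of 5 repeats, converted to seconds per call.
    timings[name] = min(timeit.repeat(func, number=n, repeat=5)) / n
    print(f"{name:12s} {timings[name] * 1e6:.2f} µs per call")
```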

However, the real question is whether these operations (taking the complement of a slice, i.e. deleting a range) need to be applied one after another. If not, you could aggregate the indices to drop via set operations and take the complement a single time at the end to obtain the final NumPy array.
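
One way to sketch this aggregation idea is a boolean keep-mask: mark every range that should be dropped, then index the array once at the end. The function name drop_ranges and the list of (start, stop) pairs are assumptions made for illustration; whether this beats repeated concatenation depends on the array size and the number of ranges.

```python
import numpy as np

def drop_ranges(a, ranges):
    """Return a copy of 1-D array `a` without the half-open ranges [start, stop)."""
    keep = np.ones(a.shape[0], dtype=bool)  # start by keeping everything
    for start, stop in ranges:
        keep[start:stop] = False            # accumulate deletions; overlaps are fine
    return a[keep]                          # a single boolean-index copy at the end

a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
result = drop_ranges(a, [(3, 6), (7, 8)])
print(result)  # [0 1 2 6 8]
```

The ranges refer to positions in the original array, so they can be collected in any order before the single indexing step.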

CodePudding user response:

I find that preallocating an array and filling it in is very slightly faster than using np.concatenate. As André mentioned in their comment, this will vary with the shape of the array.

a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])

def testing123():
    new = np.zeros(6, dtype=int)
    new[:3] = a[0:3]
    new[3:] = a[6:]
    return new

%timeit -n 100000 np.concatenate([a[:3], a[6:]], axis=0)

100000 loops, best of 5: 2.18 µs per loop

%timeit -n 100000 np.delete(a, slice(3, 6))

100000 loops, best of 5: 6.11 µs per loop

%timeit -n 100000 a[np.r_[:3,6:]]

100000 loops, best of 5: 16.4 µs per loop

%timeit -n 100000 testing123()

100000 loops, best of 5: 2.01 µs per loop

a = np.arange(10_000)

def testing123():
    new = np.empty(5000, dtype=int)
    new[:2500] = a[:2500]
    new[2500:] = a[7500:]
    return new

%timeit -n 100000 np.concatenate([a[:2500], a[7500:]], axis=0)

100000 loops, best of 5: 3.99 µs per loop

%timeit -n 100000 np.delete(a, slice(2500, 7500))

100000 loops, best of 5: 7.76 µs per loop

%timeit -n 100000 a[np.r_[:2500,7500:]]

100000 loops, best of 5: 47.3 µs per loop

%timeit -n 100000 testing123()

100000 loops, best of 5: 3.61 µs per loop
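
The fill-in functions above hard-code the sizes. A minimal generalized sketch might look like the following. The helper name rest_of_slice and its optional out buffer are assumptions added for illustration; this version was not part of the timings above, and the extra Python-level arithmetic may cost some of the small advantage measured there.

```python
import numpy as np

def rest_of_slice(a, start, stop, out=None):
    """Return the elements of 1-D array `a` outside a[start:stop].

    Assumes 0 <= start <= stop <= len(a). If `out` is given, it must be a
    1-D array of length len(a) - (stop - start) with a compatible dtype.
    """
    if out is None:
        # Uninitialized memory is fine: every element is overwritten below.
        out = np.empty(a.shape[0] - (stop - start), dtype=a.dtype)
    out[:start] = a[:start]   # part before the slice
    out[start:] = a[stop:]    # part after the slice
    return out

a = np.arange(10_000)
res = rest_of_slice(a, 2500, 7500)

# Reusing one buffer avoids a fresh allocation on every call.
buf = np.empty(5000, dtype=a.dtype)
res2 = rest_of_slice(a, 2500, 7500, out=buf)
```

When a buffer is reused, each call overwrites the previous result, so this only helps if the result is consumed before the next call.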