Set multiple rows slices to value, with a unique slice for each row

Time:03-03

Given a 2D array, I can set a slice of one row to a particular value:

import numpy as np
a = np.zeros((5, 5), dtype=int)
a[0, 2:4] = 1
a
array([[0, 0, 1, 1, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

I am trying to set multiple row slices to a particular value, with a unique slice for each row.

I have the start and end indices for the slices in two arrays:

starts = np.array([2, 0, 1, 3, 2])
ends = np.array([5, 3, 4, 5, 4])

But I can't figure out how to set these slices of the 2D array to a particular value. This attempt:

a[starts:ends] = 1

raises TypeError: only integer scalar arrays can be converted to a scalar index

CodePudding user response:

If the last dimension of the target array is large, a plain Python loop is relatively efficient, because the loop overhead is small compared with the time spent filling the array. Otherwise, AFAIK NumPy does not provide a way to do this operation efficiently (mainly because the slices have variable sizes). Here is a basic version with a Python loop:

# tolist() yields plain Python ints, which are cheaper to iterate over than NumPy integers
for i, (start, end) in enumerate(zip(starts.tolist(), ends.tolist())):
    a[i, start:end] = 1

If you want faster code, you can use Numba to speed up the loop. Note that you do not need to call tolist in that case (its purpose is to make the code faster by working with CPython integers rather than NumPy integer types).
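For comparison, here is a sketch of one vectorized alternative that the answer above does not mention: building a boolean mask with broadcasting. The names cols and mask are made up for this illustration. It allocates a temporary array with the same shape as a, so whether it beats the loop depends on the array sizes; treat it as something to benchmark, not a guaranteed win.

```python
import numpy as np

a = np.zeros((5, 5), dtype=int)
starts = np.array([2, 0, 1, 3, 2])
ends = np.array([5, 3, 4, 5, 4])

# Column indices have shape (5,); the per-row bounds are reshaped to (5, 1).
# Broadcasting the comparisons gives a boolean mask of shape (5, 5) that is
# True where start <= column < end for that row.
cols = np.arange(a.shape[1])
mask = (cols >= starts[:, None]) & (cols < ends[:, None])

# Boolean-mask assignment sets every selected element in one call.
a[mask] = 1
print(a)
```

The half-open comparison (>= start, < end) mirrors the usual slice semantics of start:end.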

CodePudding user response:

NumPy has a function, apply_along_axis, that applies a function to each 1-D slice of an array along a given axis. So in my case, I can apply the operation to each row individually.

apply_along_axis passes any extra arguments unchanged to every call, so it cannot hand each row its own start and end. To work around that, I first concatenate the start and end indices to my zeros array as two extra columns, and then slice them off the result.

import numpy as np
a = np.zeros(25).reshape(5,-1).astype(int)

starts = np.array([2, 0, 1, 3, 2])
ends = np.array([5, 3, 4, 5, 4])

# Turn the 1-D bounds into column vectors of shape (5, 1)
startsT = starts[:, np.newaxis]
endsT = ends[:, np.newaxis]

# Append the bounds as two extra columns: aa has shape (5, 7)
aa = np.concatenate((a, startsT, endsT), axis=1)

def set_1s_by_slice(x):
    x[x[-2]:x[-1]] = 1
    return x

pen = np.apply_along_axis(set_1s_by_slice, 1, aa)
ult = pen[:,0:5]
ult
array([[0, 0, 1, 1, 1],
       [1, 1, 1, 0, 0],
       [0, 1, 1, 1, 0],
       [0, 0, 0, 1, 1],
       [0, 0, 1, 1, 0]])

From looking at the source code, this may not be faster than iterating through the rows:

https://github.com/numpy/numpy/blob/v1.22.0/numpy/lib/shape_base.py#L267-L414

It seems that there are conversions to lists, though I am not certain.
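One way to settle the question is to time both approaches on your own data. The sketch below is only an illustration: the function names fill_with_loop and fill_with_apply are hypothetical, and the timings it prints will vary by machine and array size.

```python
import timeit
import numpy as np

starts = np.array([2, 0, 1, 3, 2])
ends = np.array([5, 3, 4, 5, 4])

def fill_with_loop(n_cols=5):
    # Plain Python loop over the rows, one slice assignment per row.
    out = np.zeros((starts.size, n_cols), dtype=int)
    for i, (start, end) in enumerate(zip(starts.tolist(), ends.tolist())):
        out[i, start:end] = 1
    return out

def fill_with_apply(n_cols=5):
    # Append the bounds as two extra columns so each row carries its own slice.
    base = np.zeros((starts.size, n_cols), dtype=int)
    padded = np.column_stack((base, starts, ends))

    def set_ones(row):
        row[row[-2]:row[-1]] = 1
        return row

    # Drop the two helper columns from the result.
    return np.apply_along_axis(set_ones, 1, padded)[:, :n_cols]

# Both approaches should produce the same array.
assert np.array_equal(fill_with_loop(), fill_with_apply())

print("loop: ", timeit.timeit(fill_with_loop, number=200))
print("apply:", timeit.timeit(fill_with_apply, number=200))
```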
