Given an (N, 2) array of [start, stop] values, I want an (M,) array consisting of [np.arange(start[0], stop[0]), np.arange(start[1], stop[1]), ..., np.arange(start[N - 1], stop[N - 1])].
For example, given arr = np.array([[1, 3], [4, 5], [7, 8], [8, 10]]), I want [1, 2, 4, 7, 8, 9].
One method is with a for loop:
import numpy as np

arr = np.array([[1, 3], [4, 5], [7, 8], [8, 10]])
out = np.zeros(0, dtype=int)
# Append one arange per [start, stop] row
for i in range(len(arr)):
    out = np.append(out, np.arange(arr[i, 0], arr[i, 1]))
print(out)
Result: [1 2 4 7 8 9]
Is there a more efficient and/or readable way of doing this?
CodePudding user response:
Use a list comprehension to collect the aranges, and then join them:
In [214]: alist = [[1, 3], [4, 5], [7, 8], [8, 10]]
In [215]: [np.arange(start,stop) for start,stop in alist]
Out[215]: [array([1, 2]), array([4]), array([7]), array([8, 9])]
In [216]: np.hstack(_)
Out[216]: array([1, 2, 4, 7, 8, 9])
There's a version of hstack that uses indexing syntax:
In [217]: np.r_[1:3, 4:5, 7:8, 8:10]
Out[217]: array([1, 2, 4, 7, 8, 9])
It translates the slices into arange calls. It's compact, but it's not easy to use with a predefined list.
With the predefined list, first create a tuple:
In [225]: tup = tuple([slice(start,stop) for start, stop in alist])
In [226]: tup
Out[226]: (slice(1, 3, None), slice(4, 5, None), slice(7, 8, None), slice(8, 10, None))
In [227]: np.r_[tup]
Out[227]: array([1, 2, 4, 7, 8, 9])
Anyway, there isn't a way of generating multiple aranges with one call. If the aranges are all the same length it is possible to use np.linspace, but not when they vary in length.
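To illustrate the equal-length case, here is a minimal sketch using np.linspace with array-valued start and stop (supported in NumPy 1.16 and later). The names starts and length are made up for this example and the data is not the array from the question, since those ranges differ in length.

```python
import numpy as np

# Hypothetical example data: every range has the same length
starts = np.array([1, 4, 7])
length = 2
stops = starts + length

# One row per range; endpoint=False mimics arange's half-open interval
grid = np.linspace(starts, stops, num=length, endpoint=False, axis=-1)

# linspace returns floats, so round and cast back to integers
out = np.rint(grid).astype(int).ravel()
print(out)  # [1 2 4 5 7 8]
```

This only works because all rows share one num; with mixed lengths you are back to building the aranges one at a time.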
We try to avoid repeated np.append. It is inferior to list append, because each call copies the whole array into a new one.
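As a rough sketch of that advice, the loop from the question can collect the pieces in a plain list and join them once at the end. The name pieces is just an illustrative choice.

```python
import numpy as np

arr = np.array([[1, 3], [4, 5], [7, 8], [8, 10]])

# List append is cheap; no array is copied inside the loop
pieces = []
for start, stop in arr:
    pieces.append(np.arange(start, stop))

# Join everything with a single concatenate call
out = np.concatenate(pieces)
print(out)  # [1 2 4 7 8 9]
```

Note that np.concatenate raises an error on an empty list, so an input with no rows would need its own check.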
CodePudding user response:
Here is the line you need:
[item for start, stop in [[1, 3], [4, 5], [7, 8], [8, 10]] for item in range(start, stop)]
You can use plain lists to achieve this. (Don't overuse numpy; vanilla Python is good too!)
For small amounts of data, the numpy import isn't worth it!
Python vanilla time for this problem: 2.5987625122070312e-05 s
Your method time for this problem: 0.09107470512390137 s
Same without numpy import: 0.00015807151794433594 s
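The timings above don't say how they were measured. If you want to compare the approaches on your own machine, something like the following timeit sketch could work; the function names and the repeat count are my own choices, and the numbers you get will differ from the ones quoted.

```python
import timeit
import numpy as np

pairs = [[1, 3], [4, 5], [7, 8], [8, 10]]

def pure_python(pairs):
    # Nested comprehension over plain range objects
    return [item for start, stop in pairs for item in range(start, stop)]

def numpy_hstack(pairs):
    # One arange per pair, joined with a single hstack call
    return np.hstack([np.arange(start, stop) for start, stop in pairs])

# Total time for 10,000 calls of each function (import time not included)
t_py = timeit.timeit(lambda: pure_python(pairs), number=10_000)
t_np = timeit.timeit(lambda: numpy_hstack(pairs), number=10_000)
print(f"pure Python: {t_py:.4f} s, numpy: {t_np:.4f} s")
```

For larger inputs the balance may shift, so it is worth timing with data of realistic size.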