How can I simplify this function, which converts a slicing string for PyTorch / NumPy into a list of slice objects that can then be used to index arrays and tensors?
The code below works, but it seems needlessly verbose for what it does.
def str_to_slice_indices(slicing_str: str):
    # Convert indices to lists
    indices = [
        [i if i else None for i in indice_set.strip().split(":")]
        for indice_set in slicing_str.strip("[]").split(",")
    ]
    # Handle Ellipsis "..."
    indices = [
        ... if index_slice == ["..."] else index_slice for index_slice in indices
    ]
    # Handle "None" values
    indices = [
        None if index_slice == ["None"] else index_slice for index_slice in indices
    ]
    # Handle single number values
    indices = [
        int(index_slice[0])
        if isinstance(index_slice, list)
        and len(index_slice) == 1
        and index_slice[0].lstrip("-").isdigit()
        else index_slice
        for index_slice in indices
    ]
    # Create indice slicing list
    indices = [
        slice(*[int(i) if i and i.lstrip("-").isdigit() else None for i in index_slice])
        if isinstance(index_slice, list)
        else index_slice
        for index_slice in indices
    ]
    return indices
Running the above function with an example covering the various types of inputs gives this:
out = str_to_slice_indices("[None, :1, 3:4, 2, :, 2:, ...]")
print(out)
# out:
# [None, slice(None, 1, None), slice(3, 4, None), 2, slice(None, None, None), slice(2, None, None), Ellipsis]
CodePudding user response:
@Michael suggested using eval on np.s_.
Another way to demonstrate this is to define a simple class that just accepts a __getitem__ tuple:
In [83]: class Foo():
    ...:     def __getitem__(self, arg):
    ...:         print(arg)
    ...:
In [84]: Foo()[None, :1, 3:4, 2, :, 2:, ...]
(None, slice(None, 1, None), slice(3, 4, None), 2, slice(None, None, None), slice(2, None, None), Ellipsis)
In normal Python usage, it's the interpreter that converts the ':::' kinds of strings into slice (and related) objects, and it only does so within indexing expressions. Effectively, your code tries to replicate the work that the interpreter normally does.
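As a small illustration of that point, the sketch below uses a plain Python list (no NumPy required) to show that each colon form inside square brackets is equivalent to an explicit slice object. The names data and window are only for illustration.

```python
data = list(range(10))

# The interpreter turns each bracket form on the left into the explicit
# slice object on the right before calling __getitem__.
assert data[1:4] == data[slice(1, 4)]
assert data[::2] == data[slice(None, None, 2)]
assert data[-3:] == data[slice(-3, None)]

# Outside of square brackets the colon syntax is not an expression
# (writing `window = 1:4` is a SyntaxError), so slice(...) is spelled out.
window = slice(1, 4)
print(data[window])  # [1, 2, 3]
```

This is why a string such as "3:4" has to be either parsed by hand or handed back to the interpreter inside an indexing expression.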
I haven't paid enough attention to the eval security issues to know what you have to add. It seems that the indexing syntax is pretty restrictive as it is.
It looks like strings that don't fit the slice and ellipsis syntax are passed through unchanged and unevaluated.
In [90]: Foo()['if x is 1:print(x)']
if x is 1:print(x)
My Foo and np.s_ don't try to evaluate the tuple that is passed to their __getitem__. np.s_ is nearly as simple (the code is easy to find and read).
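A minimal sketch of the eval idea, using a Foo-like class in place of np.s_ so that it runs without NumPy. IndexGrabber and str_to_index are hypothetical names, not part of any library. Passing an empty __builtins__ only narrows what the evaluated text can reach; any expression placed inside the brackets is still evaluated, so this should not be treated as safe for untrusted input.

```python
class IndexGrabber:
    """Hypothetical stand-in for np.s_: returns whatever __getitem__ receives."""

    def __getitem__(self, key):
        return key


def str_to_index(text):
    # Drop surrounding whitespace and the outer square brackets.
    inner = text.strip().strip("[]")
    # Let the interpreter build the slice objects. eval is NOT a sandbox:
    # the empty __builtins__ dict merely hides names such as __import__.
    key = eval(f"grab[{inner}]", {"__builtins__": {}}, {"grab": IndexGrabber()})
    # A single index comes back bare; several come back as a tuple.
    return list(key) if isinstance(key, tuple) else [key]


print(str_to_index("[None, :1, 3:4, 2, :, 2:, ...]"))
```

With np.s_ available, the same f-string could presumably reference it instead of the stand-in class; the result can be turned into a tuple before indexing an array.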
Normally ast.literal_eval is used as a 'safer' alternative to eval, but it only handles strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
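One possible middle ground, sketched here under the assumption of Python 3.9 or later (where a subscript's slice is stored directly as an expression node): parse the text with ast.parse, then convert only the literal parts with ast.literal_eval, so nothing is executed. parse_index and the helper functions are hypothetical names.

```python
import ast


def _literal(node):
    # Accept only the literals that make sense in an index position.
    value = ast.literal_eval(node)
    if value is None or value is Ellipsis or isinstance(value, int):
        return value
    raise ValueError(f"unsupported index element: {value!r}")


def _convert(node):
    if isinstance(node, ast.Slice):
        # lower/upper/step are None when omitted, e.g. in ":1" or "2:".
        parts = (node.lower, node.upper, node.step)
        return slice(*(None if p is None else _literal(p) for p in parts))
    return _literal(node)


def parse_index(text):
    inner = text.strip().strip("[]")
    # "_" is just a placeholder name; the tree is inspected, never evaluated.
    subscript = ast.parse(f"_[{inner}]", mode="eval").body
    index = subscript.slice  # assumption: Python 3.9+ AST layout
    elements = index.elts if isinstance(index, ast.Tuple) else [index]
    return [_convert(element) for element in elements]


print(parse_index("[None, :1, 3:4, 2, :, -10:, ...]"))
```

Names and function calls inside the brackets are rejected by ast.literal_eval with a ValueError instead of being run, and text that is not valid indexing syntax raises a SyntaxError from ast.parse.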
CodePudding user response:
Iterating multiple times is not necessary. The sample string has been slightly expanded to test more cases.
def str2slices(s):
    d = {True: lambda e: slice(*[int(i) if i else None for i in e.split(':')]),
         'None': lambda e: None,
         '...': lambda e: ...}
    return [d.get(':' in e or e.strip(), lambda e: int(e))(e.strip()) for e in s[1:-1].split(',')]
str2slices('[None, :1, 3:4, 2, :, -10: ,::,:4:2, 1:10:2, -32,...]')
Output
[None,
slice(None, 1, None),
slice(3, 4, None),
2,
slice(None, None, None),
slice(-10, None, None),
slice(None, None, None),
slice(None, 4, 2),
slice(1, 10, 2),
-32,
Ellipsis]
The same errors as in the OP's solution are caught. However, unsupported input does not silently change the result; it raises a ValueError instead.
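To illustrate the error behaviour, the sketch below repeats str2slices so it runs on its own and wraps it in a hypothetical helper, try_parse, that reports the ValueError instead of letting it propagate. Only two kinds of bad input are shown (a non-numeric token and a non-integer bound); other malformed strings may fail differently.

```python
def str2slices(s):
    # Same function as above, repeated so this snippet is self-contained.
    d = {True: lambda e: slice(*[int(i) if i else None for i in e.split(':')]),
         'None': lambda e: None,
         '...': lambda e: ...}
    return [d.get(':' in e or e.strip(), lambda e: int(e))(e.strip())
            for e in s[1:-1].split(',')]


def try_parse(s):
    # Hypothetical helper: return the parsed list, or a message on failure.
    try:
        return str2slices(s)
    except ValueError as exc:
        return f"ValueError: {exc}"


print(try_parse('[:3, -1]'))   # valid input is parsed normally
print(try_parse('[1, foo]'))   # 'foo' is not an integer
print(try_parse('[1.5:2]'))    # slice bounds must be integers
```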