How do I more efficiently convert a string of slices to slice objects that can then be used to slice

Time:04-03

How can I simplify this function that converts strings of slices for PyTorch / NumPy to slice list objects that can then be used to slice arrays & tensors?

The code below works, but it seems rather inefficient in terms of how many lines it takes.

def str_to_slice_indices(slicing_str: str):
    # Convert indices to lists
    indices = [
        [i if i else None for i in indice_set.strip().split(":")]
        for indice_set in slicing_str.strip("[]").split(",")
    ]

    # Handle Ellipsis "..."
    indices = [
        ... if index_slice == ["..."] else index_slice for index_slice in indices
    ]
    # Handle "None" values
    indices = [
        None if index_slice == ["None"] else index_slice for index_slice in indices
    ]
    # Handle single number values
    indices = [
        int(index_slice[0])
        if isinstance(index_slice, list)
        and len(index_slice) == 1
        and index_slice[0].lstrip("-").isdigit()
        else index_slice
        for index_slice in indices
    ]

    # Create indice slicing list
    indices = [
        slice(*[int(i) if i and i.lstrip("-").isdigit() else None for i in index_slice])
        if isinstance(index_slice, list)
        else index_slice
        for index_slice in indices
    ]
    return indices

Running the above function with an example covering the various types of input gives this:

out = str_to_slice_indices("[None, :1, 3:4, 2, :, 2:, ...]")
print(out)

# out:
# [None, slice(None, 1, None), slice(3, 4, None), 2, slice(None, None, None), slice(2, None, None), Ellipsis]

CodePudding user response:

@Michael suggested using eval on a np.s_.

Another way to demonstrate this is to define a simple class that just accepts a getitem tuple:

In [83]: class Foo():
    ...:     def __getitem__(self, arg):
    ...:         print(arg)
    ...: 
In [84]: Foo()[None, :1, 3:4, 2, :, 2:, ...]
(None, slice(None, 1, None), slice(3, 4, None), 2, slice(None, None, None), slice(2, None, None), Ellipsis)

In normal Python usage, it's the interpreter that converts the ':::' kinds of strings into slice and related objects, and it only does so within indexing expressions. Effectively, your code tries to replicate the work that the interpreter normally does.
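
As a rough sketch of how the eval suggestion could be combined with a class like Foo, the string can be spliced into an indexing expression so that the interpreter builds the slice objects itself. The names IndexCapture and str_to_index are made up for this illustration, and eval should only ever see trusted input:

```python
class IndexCapture:
    """Hypothetical helper: hands back whatever __getitem__ receives."""

    def __getitem__(self, arg):
        return arg


def str_to_index(slicing_str: str):
    """Let the interpreter turn "[None, :1, ...]" into index objects."""
    inner = slicing_str.strip().strip("[]")
    # WARNING: eval runs arbitrary expressions. Emptying __builtins__ is
    # not a sandbox, so only call this with strings you trust.
    result = eval(
        f"_capture[{inner}]",
        {"__builtins__": {}},
        {"_capture": IndexCapture()},
    )
    # A single index comes back bare; wrap it so the result is always a tuple.
    return result if isinstance(result, tuple) else (result,)


print(str_to_index("[None, :1, 3:4, 2, :, 2:, ...]"))
```

The result is a tuple, which is the form the interpreter passes to __getitem__ for a multi-axis index.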

I haven't paid enough attention to the eval security issues to know what you would have to add. It seems that the indexing syntax is pretty restrictive as it is.

It looks like strings that don't fit the slice and ellipsis syntax are passed through unchanged and unevaluated.

In [90]: Foo()['if x is 1:print(x)']
if x is 1:print(x)

My Foo and np.s_ don't try to evaluate the tuple that __getitem__ passes to them. np.s_ is nearly as simple (the code is easy to find and read).

Normally ast.literal_eval is used as a 'safer' alternative to eval, but it only handles strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None, so it cannot parse slice syntax such as '1:3'.
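
If eval is off the table, one possible middle ground is to let the ast module parse the indexing expression and then convert only the node types you expect. This is a minimal sketch, not a vetted implementation: parse_index and _convert are hypothetical names, and it assumes the Python 3.9+ AST layout, where the subscript's slice is a plain expression node:

```python
import ast


def _convert(node):
    """Turn one AST node into None, Ellipsis, an int, or a slice."""
    if isinstance(node, ast.Slice):
        parts = (node.lower, node.upper, node.step)
        return slice(*(None if p is None else _convert(p) for p in parts))
    if isinstance(node, ast.Constant):
        value = node.value
        # type(...) is int deliberately excludes bools.
        if value is None or value is Ellipsis or type(value) is int:
            return value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        operand = node.operand
        if isinstance(operand, ast.Constant) and type(operand.value) is int:
            return -operand.value
    raise ValueError(f"unsupported index element: {ast.dump(node)}")


def parse_index(slicing_str: str):
    """Parse a string such as "[None, :1, ...]" without evaluating it."""
    inner = slicing_str.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # Wrap the text in "_[...]" so the parser sees an indexing expression.
    tree = ast.parse(f"_[{inner}]", mode="eval")
    node = tree.body.slice  # assumption: Python 3.9+ layout
    items = node.elts if isinstance(node, ast.Tuple) else [node]
    return [_convert(item) for item in items]


print(parse_index("[None, :1, 3:4, 2, :, -10:, ::, :4:2, 1:10:2, -32, ...]"))
```

Nothing is executed here; anything other than None, an ellipsis, an integer, or a slice built from integers raises ValueError.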

CodePudding user response:

Iterating multiple times is not necessary. The sample string has been slightly expanded to test more cases.

def str2slices(s):
    d = {True: lambda e: slice(*[int(i) if i else None for i in e.split(':')]),
        'None': lambda e: None,
        '...': lambda e: ...}
    return [d.get(':' in e or e.strip(), lambda e: int(e))(e.strip()) for e in s[1:-1].split(',')]

str2slices('[None, :1, 3:4, 2, :, -10: ,::,:4:2, 1:10:2, -32,...]')

Output

[None,
 slice(None, 1, None),
 slice(3, 4, None),
 2,
 slice(None, None, None),
 slice(-10, None, None),
 slice(None, None, None),
 slice(None, 4, 2),
 slice(1, 10, 2),
 -32,
 Ellipsis]

The same errors as in OP's solution are caught. Unsupported input does not silently change the result; it raises a ValueError instead.
