What is the pythonic way of "iterating" over a single item?-CodePudding

I come across this issue often, and I would be surprised if there wasn't some very simple and pythonic one-liner solution to it.

Suppose I have a method or a function that takes a list or some other iterable object as an argument. I want for an operation to be performed once for each item in the object.

Sometimes, only a single item (say, a float value) is passed to this function. In this situation, my for-loop doesn't know what to do. And so, I find myself peppering my code with the following snippet of code:

from collections.abc import Sequence

def my_function(value):
   if not isinstance(value, Sequence):
      value = [value]

   # rest of my function

This works, but it seems wasteful and not particularly legible. In searching StackOverflow I've also discovered that strings are considered sequences, and so this code could easily break given the wrong argument. It just doesn't feel like the right approach.

I come from a MATLAB background, and this is neatly solved in that language since scalars are treated like 1x1 matrices. I'd expect, at the very least, for there to be a built-in, something like numpy's atleast_1d function, that automatically converts anything into an iterable if it isn't one.

CodePudding user response：

The short answer is nope, there is no simple built-in. And yep, if you want str (or bytes or bytes-like stuff or whatever) to act as a scalar value, it gets uglier. Python expects callers to adhere to the interface contract; if you say you accept sequences, say so, and it's on the caller to wrap any individual arguments.

If you must do this, there's two obvious ways to do it:

First is to make your function accept varargs instead of a single argument, and leave it up to the caller to unpack any sequences, so you can always iterate the varargs received:

def my_function(*values):
    for val in values:
        # Rest of function

A caller with individual items calls you with my_function(a, b), a caller with a sequence calls you with my_function(*seq). The latter does incur some overhead to unpack the sequence to a new tuple to be received by my_function, but in many cases this is fine.

If that's not acceptable for whatever reason, the other solution is to roll your own "ensure iterable" converter function, following whatever rules you care about:

from collections.abc import ByteString

def ensure_iterable(obj):
    if isinstance(obj, (str, ByteString)):
        return (obj,)  # Treat strings and bytes-like stuff as scalars and wrap
    try:
        iter(obj)  # Simplest way to test if something is iterable is to try to make it an iterator
    except TypeError:
        return (obj,)  # Not iterable, wrap
    else:
        return obj  # Already iterable

which my_function can use with:

def my_function(value):
   value = ensure_iterable(value)

CodePudding user response：

I would duck type:

def first(item):
    try:
        it=iter(item)
    except TypeError:
        it=iter([item])
    return next(it)

Test it:

tests=[[1,2,3],'abc',1,1.23]

for e in tests:
    print(e, first(e))

Prints:

[1, 2, 3] 1
abc a
1 1
1.23 1.23

CodePudding user response：

Python is a general purpose language, with true scalars, and as well as iterables like lists.

MATLAB does not have true scalars. The base object is a 2d matrix. It did not start as a general purpose language.

numpy adds MATLAB like arrays to Python, but it too can have 0d arrays (scalar arrays), which may give the wayward MATLAB users headaches.

Many numpy functions have a provision for converting their input to an array. That way they will work a list input as well as array

In [10]: x = np.array(3)
In [11]: x
Out[11]: array(3)
In [12]: x.shape
Out[12]: ()
In [13]: for i in x: print(x)
Traceback (most recent call last):
  Input In [13] in <cell line: 1>
    for i in x: print(x)
TypeError: iteration over a 0-d array

It also has utility functions that insure the array is 1d, or 2 ...

In [14]: x = np.atleast_1d(1)
In [15]: x
Out[15]: array([1])
In [16]: for i in x: print(i)
1

But like old-fashion MATLAB, we prefer to avoid iteration in numpy. It doesn't have jit compilation that lets current MATLAB users get by with iterations. Technically numpy functions do use iteration, but it usually in compiled code.

np.sin applied to various inputs:

In [17]: np.sin(1)          # scalar
Out[17]: 0.8414709848078965
In [18]: np.sin([1,2,3])    # list
Out[18]: array([0.84147098, 0.90929743, 0.14112001])
In [19]: np.sin(np.array([1,2,3]).reshape(3,1))
Out[19]: 
array([[0.84147098],
       [0.90929743],
       [0.14112001]])

Technically, the [17] result is a numpy scalar, not a base python float:

In [20]: type(Out[17])
Out[20]: numpy.float64