Home > Blockchain >  Recursion in python with list slicing vs indexes
Recursion in python with list slicing vs indexes

Time:07-22

Is there any (time/space complexity) disadvantage in writing recursive functions in python using list slicing? Form what I've seen on the internet, people tend to use lists and low/high variables in recursive functions, but for me it seems more natural to call a function recursively with sliced lists.

Here are two implementations of binary search as examples of the what I'm describing:

List slicing

def binSearch(arr,k):
    if len(arr) < 1:
        return -1
    mid = len(arr) // 2
    if arr[mid] == k:
        return mid
    elif arr[mid] < k:
        val = binSearch(arr[mid 1:],k)
        if val == -1:
            return -1
        else:
            return mid   1   val
    else:
        return binSearch(arr[:mid],k)

Indexes

def binSearch2(arr,k,low,high):
    if low > high:
        return -1
    mid = (high low) // 2
    if arr[mid] == k:
        return mid
    elif arr[mid] < k:
        return binSearch2(arr,k,mid 1,high)
    else:
        return binSearch2(arr,k,low,mid-1)

CodePudding user response:

Slices plus recursion is, in general, a double-whammy of undesirability in Python. In this case, recursion is acceptable, but the slicing isn't. If you're in a rush, scroll to the bottom of this post and look at the benchmark.

Let's talk about recursion first. Python wasn't designed to support recursion well, or at least not to the extent that functional languages that use a "natural" head/tail (car/cdr in Lisp) approximation of slicing. This generalizes to any imperative language without tail call support or first-class linked lists that allow accessing the tail in O(1).

Recursion is inappropriate for any linear algorithm in Python because the default CPython stack size is around 1000, meaning if the structure you're processing has more than 1000 elements (a very small number), your program will fail. There are dangerous hacks to increase the stack size, but this just kicks the can to other trivially small limits and risks ungraceful interpreter termination.

For a binary search, recursion is fine, because you have an O(log(n)) algorithm, so you can comfortably handle pretty much any size structure. See this answer for a deeper treatment of when recursion is and isn't appropriate in Python and why it's a second-class citizen by design. Python is not a functional language, and never will be, according to its creator.

There are also few problems that actually require recursion. In this case, Python has a builtin that should cover the rare cases where you need a binary search. For the times bisect doesn't fit your needs, writing your algorithm iteratively is arguably no less intuitive than recursion (and, I'd argue, fits more naturally into the Python iteration-first paradigm).

Moving on to slicing, although binary search is one of the rare cases where recursion is acceptable, slices are absolutely not appropriate here. Slicing the list here is an O(n) copy operation, which totally defeats the purpose of binary searching. You might as well use in, which does a linear search for the same complexity cost of a single slice. Adding slicing here makes the code easier to write, but causes the time complexity to skyrocket to O(n(log(n)).

Slicing also incurs a totally unnecessary O(n) space cost, not to mention garbage collection and memory allocation action, a potentially painful constant time cost.

Let's benchmark and see for ourselves. I used this boilerplate with one change to the namespace:

dict(arr=random.sample(range(n), k=n), k=n, low=0, high=n-1)
$ py test.py --n 1000000
------------------------------
n = 1000000, 100 trials
------------------------------

def binSearch(arr,k):

time (s) => 17.658957500000042
------------------------------

def binSearch2(arr,k,low,high):

time (s) => 0.01235299999984818
------------------------------

So for n=1000000 (not a large number at all), slicing is about 1400 times slower than indices. It just gets worse on larger numbers.


Minor nitpicks:

  • Use snake_case, not camelCase per PEP-8. Format your code with black.
  • Arrays in Python refer to something other than the type([]) => <class 'list'> you're probably using. I suggest lst or it if it's a generic list or iterable parameter.
  • Related