Home > Mobile >  Question regarding tail call optimization
Question regarding tail call optimization

Time:11-29

As far as I know, there is a prerequisite for performing tail call optimization is that the recursion point should be the last sentence in the function, and the result of the recursive call should be returned immediately. But why?

Here is a valid example for TCO:

int factorial(int num) {
  if (num == 1 || num == 0)
    return 1;
  return num * factorial(num - 1);
}

So, with the rule, can the below code be optimized too? Why not?

#include <stdio.h>
int factorial(int num) {
  if (num == 1 || num == 0)
    return 1;
  int temp = num * factorial(num - 1);
  printf("%d", temp);
  return temp;
}

I want to know how should I explain to others why the above rule is necessary for having a TCO. But not just simply follow.

CodePudding user response:

the result of the recursive call should be returned immediately. But why?

That's because in order to optimize a tail call you need to convert the final recursive call into a simple "jump" instruction. When you do this, you are merely "replacing" function arguments and then re-starting the function.

This is only possible if you can "throw away" the current stack frame and re-use it for the same function again, possibly overwriting it. If you need to remember a value to do more calculations and then return, you cannot use the same stack frame for the recursive call (i.e. cannot turn the "call" into a "jump"), as it could possibly erase/modify the value you wanted to remember before returning.

Furthermore, if your function is very simple (like yours) chances are that it could be written without using the stack at all (except for the return address maybe), and only store data in registers. Even in this case, you don't want to make a jump to the same function (that uses the same registers) if you need to remember one of the values before returning.

Here is a valid example for TCO:

int factorial(int num) {
  if (num == 1 || num == 0)
    return 1;
  return num * factorial(num - 1);
}

This is not valid TCO! You are doing return num * <recursive-call>. The recursive call is not the last thing that the function does, there is a multiplication before returning. It's the same as writing:

int factorial(int num) {
    if (num == 1 || num == 0)
        return 1;
    int tmp = factorial(num - 1);
    tmp *= num;
    return tmp;
}

can the below code be optimized too?

Nope! Again there simply is no tail call there, and it's even more obvious. You are first doing the recursive call, then some other stuff (multiplication and printf), and then returning. This cannot be optimized as a tail call by the compiler.

On the other hand, the following code can be optimized as a tail call:

int factorial(int n, int x) {
    if (n == 1)
        return x;
    int tmp = factorial(n - 1, n * x);
    return tmp;
}

You don't necessarily have to make the recursive call right on the last line of the function. The important thing is that you don't do work in the middle (between the recursive call and the return statement), like for example calling other functions or doing additional calculations.


IMPORTANT: note that just the fact that a classical TCO cannot be performed does not mean that the compiler will not be able to optimize your code in some other way. In fact, your first function is so simple that when compiled with GCC on x86-64 with at least -O2 it just gets converted from recursive to iterative (it basically becomes a single loop). The same goes for my example above, the compiler just doesn't care to do TCO, it sees an even better optimization to make in this case.

Here's the assembler dump of your first function generated by GCC 11 on x86-64 (Godbolt link if you want to play with it). In case you are not familiar with x86: the num argument is in edi, and eax is used for the return value.

factorial:
        mov     eax, 1
        cmp     edi, 1
        jbe     .L1
.L2:
        mov     edx, edi
        sub     edi, 1
        imul    eax, edx
        cmp     edi, 1
        jne     .L2
.L1:
        ret

CodePudding user response:

Each invocation of a function creates a stack frame with any data passed into that function via arguments. If a function calls another function (including itself) a new stack frame is pushed onto the stack. When a function is completely finished, its frame is popped off the stack.

Stack memory is limited. If we try to push too many frames onto the stack, we get a stack overflow error.

Where tail call optimization comes into play is to recognize that a function is complete if there is no work left to be done after the tail call.

Consider a way of recursively summing a range of numbers.

int sum(int start, int stop) {
    if (start == stop) {
        return start;
    }
    else {
        return start   sum(start   1, stop);
    }
}

If we call sum(1, 5) the recursion looks something like:

sum(1, 5)
1   sum(2, 5)
1   2   sum(3, 5)
1   2   3   sum(4, 5)
1   2   3   4   sum(5, 5)
1   2   3   4   5

Several stack frames have to be created to hold this.

Typically tail-call optimization for something that requires building up a value involves an accumulator argument passed to the function.

int sum_tco(int start, int stop, int acc) {
    if (start == stop) {
        return start   acc;
    }
    else {
        return sum_tco(start   1, stop, start   acc);
    }
}

Now consider what the recursion looks like:

sum_tco(1, 5, 0)
sum_tco(2, 5, 1   0)
sum_tco(3, 5, 2   1   0)
sum_tco(4, 5, 3   2   0)
sum_tco(5, 5, 5   4   3   2   1   0)
5   4   3   2   1   0

We don't need to know what the result of sum(1, 5, 0) or sum(3, 5, 2 1 0) is to know what the result of sum(5, 5, 5 4 3 2 1 0) is, and neither does your computer.

A smart compiler realizes this and removes all of those previous stack frames as it goes. With TCO, no matter how many times this function recursively calls itself, it will never overflow the stack.

(Descriptions of how the stack behaves have been generalized and are not intended to be technically in-depth but rather to demonstrate the generalized concept of TCO.)

  • Related