Title basically, can we design a program that inputs another code file, say for example a Python program, and can tell you its time complexity?
The program can read the program word for word and indent for indent, and can count how many for
or while
statements it encounters. It can then see if they are nested for quadratic time. I feel like it's not like the Halting problem, since we're not looking to see if it will end, just its time complexity. But, what about algorithms that implement recursion? Would such a program be possible to write still?
Sorry if this seems like a silly questions, I was pondering this and was thinking of trying to write it myself.
CodePudding user response:
Mostly not.
"Doesn't end" is just another time complexity. And therefore you can't write a program that determines how quickly things halt without determining first whether they halt at all.
Furthermore it doesn't take long to get into very hard problems. For example consider the following function:
def hailstone(n):
answer = 0
while 1 < n:
answer = 1
if 0 == n % 2:
n = n // 2
else:
n = 3*n 1
return answer
Just a while and an if. But if you can tell me that this runs in time O(n)
, then you've just solved the Collatz Problem.
That said, it IS possible to produce an upper bound for some useful code. However said upper bound has to be infinity for a lot of code.
CodePudding user response:
you can do this by measuring runtimes for datasets of different sizes and then fit the complexity like this:
however you still need to create meaningful input data in order this to work reliably, also you need precise enough time measurements , and also do not forget to flush CACHEs in case of reusing the same input data...
So you should pay attention to:
- dataset size
- dataset content
- measured times should be at least ~100 ms
I recommend to see these:
also algorithms follow complexity curve usually only after some threshold dataset size so you might want to add some kind of detection of that ... and then use only bigger datasets.
However note that obtained complexity is not really the raw complexity of used algorithm but a coumulation of both algorithm and used computing architecture features so the results might differ from what you expect.