Finding the Most Efficient Way to Reduce a Fraction


As the title suggests, I've been trying to find the most efficient way to reduce a fraction (i.e. 10/20 -> 1/2) and I came up with this.

def recursive(n, d):
    i=1
    loop = True

    while loop == True:
        i += 1
        if n % i == 0:
            if d % i == 0:
                loop = False

        if i > min(n, d):
            return f'{n} / {d}'

    nn = n // i
    nd = d // i
    return recursive(nn, nd), f'{n}/{d} Common D: {i}'

My thought process was in the worst case scenario this will count up to the minimum of the numerator, denominator pair - n/d doing (N) computations in the process. While if the fraction had something to reduce in the first place it would do so. And in doing so, it'd reduce (N), i.e. how much it needs to count in the first place.

For example given a random fraction like 206/202, it would immediately divide it in two and call the function again, with the new call only having to count to 101, therefore doing about (N/2) computations.

Wanting to test this, I made a control function that always counts to the minimum of the n/d pair (thus always doing N computations), as follows:

def big_O(n, d):
    list_ = [f'{n} / {d} OG']
    for iteration in range(2, min(n, d) + 1):
        if n % iteration == 0:
            if d % iteration == 0:
                list_.append(f'{n//iteration} / {d//iteration} Common D: {iteration}')
    return list_

And I timed them against each other with a janky timer I had on hand and a random number generator:

from random import randint, randrange

def gen():
    return randint(1, 1000), randint(1, 1000)

However, much to my surprise, my function performed significantly worse. I didn't save many of the results, but here is one which is pretty indicative of the rest:

Given 100,000 samples

SIMPLE: ORDERED BY SPEED

  1. frac big O -- 2.3903738830000023 seconds
  2. frac recursive -- 8.184171485 seconds

This came as a shock to me, as I thought that in the worst-case scenario, due to bad luck with getting un-reducible fractions, they should be roughly equal! So I calculated how often the generator would come up with a fraction that couldn't be reduced, and I got something around 60% of the time, and I thought to myself,

"Ok, what if I make them all even. Then given the logic above (206/202), surely my function should be faster."

def gen_even():
    return randrange(2,2002,2), randrange(2,2002,2)

And the results are!

With 100,000 Samples

SIMPLE: ORDERED BY SPEED

  1. frac.big_O() -- 4.2074602940000005 seconds
  2. frac.recursive() -- 7.840177308999998 seconds

It's slightly better... BUT IT'S STILL WORSE! How can this be?! I don't get it. After this I looked at many, many different things, but in the end I couldn't figure it out. I did, however, find out some more fun facts: for instance, once the fraction can be reduced at least twice (i.e. 12/8 -> 6/4 -> 3/2), my function starts being better by a factor of how many more reductions there are.

For example with this generation function:

def genx4():
    return randint(1,1000)*4, randint(1,1000)*4

The results look like this:

100,000 samples

SIMPLE: ORDERED BY SPEED

  1. frac recursive -- 6.605415995000001 seconds
  2. frac big O -- 8.567245255000003 seconds

And if you replace the *4 with *1000, they look like this:

With only 100 samples!

SIMPLE: ORDERED BY SPEED

  1. frac recursive -- 0.014295906999997499 seconds
  2. frac big O -- 2.250979277999999 seconds

Except why is it only good here?! I don't know. So if any of the kind people who have read this far have anything to say, from alternative ways of reducing fractions to why my recursive function is worse given the above, please go ahead, and thank you very much for your time!

CodePudding user response:

Before you call recursive, you have tested n and d against all the divisors less than i, so you know that i is the smallest number that divides both of them. Dividing by i cannot make a new divisor spring into existence, so there are still no divisors smaller than i. Nonetheless, recursive will start again from 2, fruitlessly doing trial divisions until it again reaches i or, with luck, the smaller of n and d.

Your iterative solution does not have that particular inefficiency, so it will often be more efficient.
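One way to remove that redundant work (a sketch of the idea, not necessarily the fastest fix, and with my own function name) is to pass the current trial divisor down, so each recursive call resumes where the previous one stopped instead of restarting from 2:

```python
def reduce_resume(n, d, start=2):
    # Divisors below `start` were already ruled out by the caller,
    # so trial division can resume there instead of restarting at 2.
    i = start
    while i <= min(n, d):
        if n % i == 0 and d % i == 0:
            # i divides both: divide it out and keep going from the same i.
            return reduce_resume(n // i, d // i, i)
        i += 1
    return n, d

print(reduce_resume(10, 20))    # (1, 2)
print(reduce_resume(206, 202))  # (103, 101)
```

With this change, the total number of trial divisions across all recursive calls stays bounded by the worst case of the iterative version, instead of multiplying with each reduction.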

Obviously, you could do a lot better. For one thing, you only need to test primes; in particular, once you have established that n and d are not both even, it's pointless to try any other even trial divisor. So right there you can cut the number of trial divisions in half.
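As a rough sketch of that halving (my own helper, not the poster's code): divide out all factors of 2 once up front, after which only odd trial divisors need to be checked.

```python
def reduce_skip_even(n, d):
    # Strip all shared factors of 2 first.
    while n % 2 == 0 and d % 2 == 0:
        n //= 2
        d //= 2
    # Now a shared divisor can only be odd, so step by 2.
    i = 3
    while i <= min(n, d):
        while n % i == 0 and d % i == 0:
            n //= i
            d //= i
        i += 2
    return n, d

print(reduce_skip_even(12, 8))  # (3, 2)
```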

But by far the simplest approach is to use Euclid's algorithm to find the Greatest Common Divisor (GCD) between n and d. Then that's what you need to divide by. Euclid's algorithm is extremely fast (if you use the remainder operator rather than repeated subtraction), and while you're implementing it on a modern computer, you can meditate on the amazing fact that it was invented by Greek mathematicians (possibly even predating Euclid, who wrote it down) more than two thousand years ago.
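A minimal remainder-based version of Euclid's algorithm looks like this (function names are my own):

```python
def gcd(a, b):
    # Euclid's algorithm with the remainder operator:
    # gcd(a, b) == gcd(b, a % b), and gcd(a, 0) == a.
    while b:
        a, b = b, a % b
    return a

def reduce_fraction(n, d):
    g = gcd(n, d)
    return n // g, d // g

print(reduce_fraction(10, 20))  # (1, 2)
```

Note that this divides n and d exactly once, by the single largest common divisor, rather than peeling off small factors one at a time.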

CodePudding user response:

For the 'most efficient' solution, you're just looking for the GCD of n and d, and the Euclidean algorithm finds it in O(log(max(n, d))) division (remainder) operations. Unless you're a theoretician or dealing with massive numbers, it's probably not worth trying to optimize much beyond that. See, for example, the wiki on greatest common divisors for more information on GCD calculation, but to summarize: you'd have to use fast multiplication algorithms with inputs in the 'thousands of digits' range to start outperforming normal multiplication.
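In practice, Python's standard library already ships this: `math.gcd` is a fast GCD, and `fractions.Fraction` reduces automatically on construction.

```python
from math import gcd
from fractions import Fraction

n, d = 206, 202
g = gcd(n, d)                # 2
print(f'{n // g}/{d // g}')  # 103/101

# Fraction normalizes to lowest terms by itself.
print(Fraction(206, 202))    # 103/101
```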

For the surprising timing results, it's likely due to while-loop overhead compared to for-loops in Python's internals. Try disassembling both functions: the for-loop iteration is done in C, while the while-loop iteration is done in Python bytecode, which is slower.
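You can see the difference with the `dis` module on two toy loops (hypothetical helpers, not the poster's functions): the while version re-executes a compare-and-jump in bytecode on every pass, whereas the for version's `FOR_ITER` advances the range iterator in C.

```python
import dis

def while_count(n):
    # Loop condition evaluated in Python bytecode each iteration.
    i = 0
    while i < n:
        i += 1
    return i

def for_count(n):
    # range iteration is advanced inside the interpreter's C loop.
    total = 0
    for _ in range(n):
        total += 1
    return total

dis.dis(while_count)
dis.dis(for_count)
```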

On the short time-scales you're measuring, performance can be highly unpredictable. As a result, timing many short loops might tell you more about the efficiency of the language's loop implementation than the algorithmic efficiency of your code.

The fact that you only saw results compatible with the asymptotic predictions when your inputs got large isn't all that surprising: that's the core idea of asymptotic analysis.
