Gap in primes(Codewars) - out of time-CodePudding

Question: The prime numbers are not regularly spaced. For example from 2 to 3 the gap is 1. From 3 to 5 the gap is 2. From 7 to 11 it is 4. Between 2 and 50 we have the following pairs of 2-gaps primes: 3-5, 5-7, 11-13, 17-19, 29-31, 41-43

A prime gap of length n is a run of n-1 consecutive composite numbers between two successive primes (see: http://mathworld.wolfram.com/PrimeGaps.html).

We will write a function gap with parameters:

g (integer >= 2) which indicates the gap we are looking for

m (integer > 2) which gives the start of the search (m inclusive)

n (integer >= m) which gives the end of the search (n inclusive)

n won't go beyond 1100000.

In the example above gap(2, 3, 50) will return [3, 5] or (3, 5) or {3, 5} which is the first pair between 3 and 50 with a 2-gap.

So this function should return the first pair of two prime numbers spaced with a gap of g between the limits m, n if these numbers exist otherwise `nil or null or None or Nothing (or ... depending on the language).

# My attempt
def gap(g, m, n):
    prime = [True for _ in range(n 1)]
    p = 2
    
    while (p*p <= n):
        if prime[p]:
            for i in range(p*p, n 1, p):
                prime[i] = False
        p  = 1
        
    nums = [i for i, T in enumerate(prime) if T and i>m]
    
    for i in range(len(nums)-1):
        if nums[i 1] - nums[i] == g:
            return [nums[i], nums[i 1]]
        
    else: return

My function works fine in smaller numbers, but when it comes to more than 6 digits it ran out of time. Where could I improve my code?

CodePudding user response：

You could eliminate the whole second part with the nums list by keeping track of the distance between primes in your sieve of Eratosthenes logic. Also, you could initialize you primes list with multiples of two already discarded, then step by two to check primes after 2 and the rule of 6 after 5.

You can update the the multiples of primes with

primes[p*p::2*p] = [False]*len(range(p*p,len(primes),2*p))

instead of a for loop.

Here's what the optimized version could look like:

def gap(g, m, n):
    isPrime = [False,False,True] [True,False]*(n//2)
    prev,p,inc = 2,3,2    
    while p<=n:
        if isPrime[p]:
            isPrime[p*p::2*p] = [False]*len(range(p*p,len(isPrime),2*p))
            if prev>=m and p-prev == g:
                return prev,p
            prev = p
        p,inc = p inc,2 if p<5 else (6-inc)

Personally, I would approach this a little differently. I would first create a generator function that produces primes. Then I would use that generator function to get primes and track the differences from one to the next and return the first eligible pair:

def primeGen(maxPrime):   # primes generator, up to maxPrime
    skip = dict()
    yield 2
    p = 1
    while p<=maxPrime:
        p  = 2
        if p not in skip:
            yield p
            if p*p<maxPrime: skip[p*p] = 2*p
            continue
        step = skip.pop(p)
        mult = p   step
        while mult in skip: mult  = step
        skip[mult] = step

def primeGap(g,m,n):
    previous = None
    for prime in primeGen(n):
        if prime<m: continue
        if previous and prime-previous == g:
            return previous,prime
        previous = prime

Note that this is slower (2x) than the sieve when the gap is not found early (before n/4) so you should stick with the optimized gap() function if execution time is constrained

Output:

print(primeGap(2,3,50))        # (3,5)
print(primeGap(4,3,50))        # (7,11)
print(primeGap(6,3,50))        # (23,29)
print(primeGap(8,1000,2000))   # (1109, 1117)     
print(primeGap(48,1,1100000))  # (28229, 28277)    6.5 milliseconds
print(primeGap(100,1,1100000)) # (396733, 396833) 79.5 milliseconds

[EDIT] work around for timeout ...

It seems the issue is with regenerating primes on every test. If you create a list of primes in a global variable (outside the function) and use it for the gap search in the function, then you won't get the timeout.

primes   = [2]     
maxPrime = 1100000
isPrime = [False,False,True] [True,False]*(maxPrime//2)
p,inc = 3,2    
while p<=maxPrime:
    if isPrime[p]:
        primes.append(p)
        isPrime[p*p::2*p] = [False]*len(range(p*p,len(isPrime),2*p))
    p,inc = p inc,2 if p<5 else (6-inc)

def gap(g, m, n):
    prev = None
    for p in primes:
        if p>n: break
        if p<m: continue
        if prev and p-prev == g: return [prev,p]
        prev = p

CodePudding user response：

It depends what your time limit is. A simple optimization, which nevertheless yields the same computational complexity is to avoid copying the prime numbers that you found to a new vector.

Using prime, you can iterate over the array using a "window" of length g. starting from m, you check whether m is a prime and whether m g is a prime. If yes, then Bingo! Otherwise you move on to m 1.

You can see that in the following code

import time

def original_gap(g, m, n):
    prime = [True for _ in range(n   1)]
    p = 2

    while p * p <= n:
        if prime[p]:
            for i in range(p * p, n   1, p):
                prime[i] = False
        p  = 1

    nums = [i for i, T in enumerate(prime) if T and i >= m]

    for i in range(len(nums)-1):
        if nums[i   1] - nums[i] == g:
            return [nums[i], nums[i   1]]

    return

def new_gap(g, m, n):
    if g > n - m:
        return None

    prime = [True for _ in range(n   1)]
    p = 2

    while p * p <= n:
        if prime[p]:
            for i in range(p * p, n   1, p):
                prime[i] = False
        p  = 1

    for index in range(m, n - g   1):
        if prime[index] and prime[index   g]:
            return [index, index   g]

    return None

if __name__ == "__main__":
    start = time.time()
    print(new_gap(2, 3, 1100000))
    end = time.time()
    print("Time for 3-1100000 is: ", end - start)

    start = time.time()
    print(original_gap(2, 3, 1100000))
    end = time.time()
    print("Time for 3-1100000 is: ", end - start)