Question: The prime numbers are not regularly spaced. For example from 2 to 3 the gap is 1. From 3 to 5 the gap is 2. From 7 to 11 it is 4. Between 2 and 50 we have the following pairs of 2-gaps primes: 3-5, 5-7, 11-13, 17-19, 29-31, 41-43
A prime gap of length n is a run of n-1 consecutive composite numbers between two successive primes (see: http://mathworld.wolfram.com/PrimeGaps.html).
We will write a function gap with parameters:
g (integer >= 2) which indicates the gap we are looking for
m (integer > 2) which gives the start of the search (m inclusive)
n (integer >= m) which gives the end of the search (n inclusive)
n won't go beyond 1100000.
In the example above gap(2, 3, 50) will return [3, 5] or (3, 5) or {3, 5} which is the first pair between 3 and 50 with a 2-gap.
So this function should return the first pair of two prime numbers spaced with a gap of g between the limits m, n if these numbers exist otherwise `nil or null or None or Nothing (or ... depending on the language).
# My attempt
def gap(g, m, n):
prime = [True for _ in range(n 1)]
p = 2
while (p*p <= n):
if prime[p]:
for i in range(p*p, n 1, p):
prime[i] = False
p = 1
nums = [i for i, T in enumerate(prime) if T and i>m]
for i in range(len(nums)-1):
if nums[i 1] - nums[i] == g:
return [nums[i], nums[i 1]]
else: return
My function works fine in smaller numbers, but when it comes to more than 6 digits it ran out of time. Where could I improve my code?
CodePudding user response:
You could eliminate the whole second part with the nums list by keeping track of the distance between primes in your sieve of Eratosthenes logic. Also, you could initialize you primes list with multiples of two already discarded, then step by two to check primes after 2 and the rule of 6 after 5.
You can update the the multiples of primes with
primes[p*p::2*p] = [False]*len(range(p*p,len(primes),2*p))
instead of a for loop.
Here's what the optimized version could look like:
def gap(g, m, n):
isPrime = [False,False,True] [True,False]*(n//2)
prev,p,inc = 2,3,2
while p<=n:
if isPrime[p]:
isPrime[p*p::2*p] = [False]*len(range(p*p,len(isPrime),2*p))
if prev>=m and p-prev == g:
return prev,p
prev = p
p,inc = p inc,2 if p<5 else (6-inc)
Personally, I would approach this a little differently. I would first create a generator function that produces primes. Then I would use that generator function to get primes and track the differences from one to the next and return the first eligible pair:
def primeGen(maxPrime): # primes generator, up to maxPrime
skip = dict()
yield 2
p = 1
while p<=maxPrime:
p = 2
if p not in skip:
yield p
if p*p<maxPrime: skip[p*p] = 2*p
continue
step = skip.pop(p)
mult = p step
while mult in skip: mult = step
skip[mult] = step
def primeGap(g,m,n):
previous = None
for prime in primeGen(n):
if prime<m: continue
if previous and prime-previous == g:
return previous,prime
previous = prime
Note that this is slower (2x) than the sieve when the gap is not found early (before n/4) so you should stick with the optimized gap() function if execution time is constrained
Output:
print(primeGap(2,3,50)) # (3,5)
print(primeGap(4,3,50)) # (7,11)
print(primeGap(6,3,50)) # (23,29)
print(primeGap(8,1000,2000)) # (1109, 1117)
print(primeGap(48,1,1100000)) # (28229, 28277) 6.5 milliseconds
print(primeGap(100,1,1100000)) # (396733, 396833) 79.5 milliseconds
[EDIT] work around for timeout ...
It seems the issue is with regenerating primes on every test. If you create a list of primes in a global variable (outside the function) and use it for the gap search in the function, then you won't get the timeout.
primes = [2]
maxPrime = 1100000
isPrime = [False,False,True] [True,False]*(maxPrime//2)
p,inc = 3,2
while p<=maxPrime:
if isPrime[p]:
primes.append(p)
isPrime[p*p::2*p] = [False]*len(range(p*p,len(isPrime),2*p))
p,inc = p inc,2 if p<5 else (6-inc)
def gap(g, m, n):
prev = None
for p in primes:
if p>n: break
if p<m: continue
if prev and p-prev == g: return [prev,p]
prev = p
CodePudding user response:
It depends what your time limit is. A simple optimization, which nevertheless yields the same computational complexity is to avoid copying the prime numbers that you found to a new vector.
Using prime
, you can iterate over the array using a "window" of length g
. starting from m
, you check whether m is a prime and whether m g
is a prime. If yes, then Bingo! Otherwise you move on to m 1
.
You can see that in the following code
import time
def original_gap(g, m, n):
prime = [True for _ in range(n 1)]
p = 2
while p * p <= n:
if prime[p]:
for i in range(p * p, n 1, p):
prime[i] = False
p = 1
nums = [i for i, T in enumerate(prime) if T and i >= m]
for i in range(len(nums)-1):
if nums[i 1] - nums[i] == g:
return [nums[i], nums[i 1]]
return
def new_gap(g, m, n):
if g > n - m:
return None
prime = [True for _ in range(n 1)]
p = 2
while p * p <= n:
if prime[p]:
for i in range(p * p, n 1, p):
prime[i] = False
p = 1
for index in range(m, n - g 1):
if prime[index] and prime[index g]:
return [index, index g]
return None
if __name__ == "__main__":
start = time.time()
print(new_gap(2, 3, 1100000))
end = time.time()
print("Time for 3-1100000 is: ", end - start)
start = time.time()
print(original_gap(2, 3, 1100000))
end = time.time()
print("Time for 3-1100000 is: ", end - start)