Postponed Sieve algorithm with start logic-CodePudding

Based on the answer by Will Ness, I've been using a JavaScript adaptation for the postponed sieve algorithm:

function * primes() {
    yield 2;
    yield 3;
    yield 5;
    yield 7;
    const sieve = new Map();
    const ps = primes();
    ps.next() && ps.next();
    for (let p = 3, i = 9; true; i  = 2) {
        let s = sieve.get(i);
        if (s !== undefined) {
            sieve.delete(i);
        } else if (i < p * p) {
            yield i;
            continue;
        } else {
            s = 2 * p;
            p = ps.next().value;
        }
        let k = i   s;
        while (sieve.has(k)) k  = s;
        sieve.set(k, s);
    }
}

But now that I need to add start point to it, I am struggling to wrap my head around it, for the logic here isn't straightforward.

When start is a prime, I need it to be the first value. And when start is not a prime, I need the sequence to start with the first prime after start.

Will Ness suggested in one of the comments:

You would have to come up with the valid sieve dictionary for the start point. It's easy to get the needed primes - just run this algo up to sqrt(start), then you need to find each core prime's multiple just above (or is it just below?) the start, while minding the duplicates.

Wielding this into reality however is not that simple (or at least for me) :|

Can anybody help with updating this algorithm for such *primes(start) implementation (preferably in JavaScript as above)?

function * primes(start) {

  // how to amend it to support 'start' logic???

}

CodePudding user response：

This is the sieve of Eratosthenes:

primes = {2,3,...} \ U_(p<-primes) {p², p² p, ...} = {2} oddPrimes , oddPrimes = {3,5,...} \ U_{(p<-oddPrimes)} {p², p² 2*p, ...}

where \ is set difference (read "minus"), and U big union of sets.

An example might illustrate:

{2,3,4,5,6,7,8,9,10,11,12,13,14,15,...}
 \             |
   { 4,  6,  8,| 10,   12,   14,   16,   18, ...}
    \          .
             { 9,      12,      15,      18,    21,  ...}
       \ 
                                               { 25, 30, 35, ...}
          \                                     { 49, 56, 63, ...}
            \                                     { 121, 132, 143, ...}
              \  ........

or for the odd primes,

{3,5,7,9,11,13,15,17,19,21,23,25,27,29,31, ...}
 \                    |
     { 9,      15,    | 21,      27,      33, ...}
   \                          .
                            { 25,            35, ...}
     \                                        { 49, 63, ...}
      \                                         { 121, 143, ...}
       \  ........

The code you refer to in the question implements this approach. At any point in time, when considering a certain candidate, sieve exists in a certain state, as are the rest of the variables in the loop. So we just need to re-create this state directly.

Let's say we're considering i = 19 as the current candidate. At that point we have sieve = { (21, 6) } and p = 5.

This means for a candidate i, sieve contains multiples of all the primes q such that q^2 < i, and p is the next prime after the qs.

Each of the multiples is the smallest one not smaller than i, and there are no duplicates in the sieve. Then it's in a consistent state and can be continued from that point on.

Thus, in pseudocode:

primes() = ..... // as before

primesFrom(start) =
  let
  { primes.next()
  ; ps = takeWhile( (p => p^2 < start) , primes() )
  ; p = primes.next_value()
  ; mults = map( (p => let 
                       { s = 2*p 
                       ; n = (start-p)/s 
                       ; r = (start-p)%s
                       ; m = p   (r>0 ? n 1 : n)*s
                       }
                       ( m, s) )
                , ps)
  }
  for each (m,s) in mults
    if m in sieve
       increment m by s until m is not in sieve
    add (m,s) to sieve

and then do the same loop as before.

(up to any "off-by-one" errors :))

CodePudding user response：

I have modified Will Ness's amazing answer in two ways

First, to allow any start value.

function* primeGenerator() {
  yield 2;
  yield 3;
  yield 5;
  yield 7;
  const multiplesAndTheirSkipDistances = new Map();
  const trailingPrimes = primeGenerator();
  trailingPrimes.next(); // Discard the 2
  trailingPrimes.next(); // Discard the 3
  let trailingPrime = 3;
  let candidate = 7;
  while (true) {
    candidate  = 2;

    let skipDistance = multiplesAndTheirSkipDistances.get(candidate);
    if (skipDistance !== undefined) {
      multiplesAndTheirSkipDistances.delete(candidate);
    } else if (candidate < trailingPrime * trailingPrime) {
      yield candidate;
      continue;
    } else {
      skipDistance = 2 * trailingPrime;
      trailingPrime = trailingPrimes.next().value;
    }

    let aMultiple = candidate   skipDistance;
    while (multiplesAndTheirSkipDistances.has(aMultiple)) {
      aMultiple  = skipDistance;
    }

    multiplesAndTheirSkipDistances.set(aMultiple, skipDistance);
  }
}

const maxItems = 20;
const minimumPrintablePrime = 1e8;
const primeObject = primeGenerator();
let prime;
let count = 0;
do {
  prime = primeObject.next().value;
  if (prime > minimumPrintablePrime) {
    console.log(prime);
    count  ;
  }
} while (count < maxItems);

Second, to give an explanation of what it is doing.

function* primeGenerator() {
  yield 2;
  yield 3;
  yield 5;
  yield 7;
  const multiplesAndTheirSkipDistances = new Map();
  const trailingPrimes = primeGenerator();
  trailingPrimes.next(); // Discard the 2
  trailingPrimes.next(); // Discard the 3
  let trailingPrime = 3;
  let candidate = 7; //
  while (true) {
    console.log(" ");

    candidate  = 2;
    console.log(
      `- Candidate ${candidate}, trailing prime ${trailingPrime}, map of multiples:`,
      multiplesAndTheirSkipDistances,
      " -"
    );

    let skipDistance = multiplesAndTheirSkipDistances.get(candidate);
    if (skipDistance !== undefined) {
      console.log(
        `Rejecting because it is listed in the map of multiples. Can remove ${candidate} from map now.`
      );
      multiplesAndTheirSkipDistances.delete(candidate);
    } else if (candidate < trailingPrime * trailingPrime) {
      console.log(
        `Not in the map of multiples, and <${trailingPrime}*${trailingPrime}, so must be prime.`
      );
      yield candidate;
      continue;
    } else {
      skipDistance = 2 * trailingPrime;
      trailingPrime = trailingPrimes.next().value;
      console.log(
        `Rejecting because is square of trailing prime. Got new trailing prime, ${trailingPrime}.`
      );
    }

    let message = `Skip forward in steps of 2*${
      skipDistance / 2
    } to find a number not already in the map of multiples, `;
    let aMultiple = candidate   skipDistance;
    while (multiplesAndTheirSkipDistances.has(aMultiple)) {
      aMultiple  = skipDistance;
    }

    console.log(
      message  
      `namely ${aMultiple}. Mark as a multiple, with skipDistance ${skipDistance}.`
    );
    multiplesAndTheirSkipDistances.set(aMultiple, skipDistance);
  }
}

const maxItems = 20;
const primeObject = primeGenerator();
let prime;
let count = 0;
do {
  prime = primeObject.next().value;
  console.log(prime);
  count  ;
} while (count < maxItems);

The Ness algorithm is astonishingly economical on storage

It uses several instances of the prime generator. The "main" one, which is the one yielding primes to the calling program, has surprisingly small storage requirements.

At any one time, it stores a small portion of the infinite Eratosthenes sieve. Just enough for it to be able to yield the next prime confidently.

Whenever it comes across a "multiple" (i.e. a non-prime), it can reject that candidate. Along with the multiple, is stored the factor of which it is a multiple. For example, 15 is marked as a multiple of 3. When the algorithm comes to 15, it realises it is a multiple of 3, so it rejects the 15.

From now on, it no longer needs to remember that 15 is a multiple, because 15 will never be a candidate again. It can delete the 15 entry from its map of multiples, and replace it with the next multiple of 3. However the very next multiple will always be even, because all candidates (e.g. 15) and all factors (e.g. 3) are odd. Therefore it only needs to tell the map that 15 2x3 is a multiple, i.e. it can step forward by 2 x the factor. Moreover, if the resulting number, 21, is already marked as a multiple, it need not mark that number: it can advance by a further 2x3 to 15 2x3 2x3, etc, until it finds a number that is not already marked as a multiple.

Cleverly, this process ensures that every factor (e.g. 3) remains marking a multiple in the map forever. To assess a candidate N, the map simply stores one value for every prime up to sqrt(N).

When the candidate rises above the square of the largest factor, the algorithm needs to get a larger factor. It simply uses a second instance of itself to get the next prime in the sequence. I was at first worried about this creating vast memory demands, but that is not so. Assessing a number ~2^64, would need all the second instance for ~2^32, which in turn would call a third instance for ~2^16, and so on. Only a handful of instances of the generator need to be created even for primes of enormous size, and if an instance's map has factors up to F, the next instance only needs factors up to sqrt(F), which rapidly becomes small.