Home > Software engineering >  How to Optimise my code that computes the sum of all from less than 2 million
How to Optimise my code that computes the sum of all from less than 2 million

Time:03-23

I've tried this problem from Project Euler where I need to calculate the sum of all primes until two million.

This is the solution I've come up with -

#include <stdio.h>

int main() {
    long sum = 5;  // Already counting 2 and 3 in my sum.
    int i = 5;     // Checking from 5 
    int count =  0;
    while (i <= 2000000) {
        count = 0;
        for (int j = 3; j <= i / 2; j  = 2) {
            // Checking if i (starting from 5) is divisible from 3 
            if (i % j == 0) {   // to i/2 and only checking for odd values of j
                count = 1;
            }
        }
        if (count == 0) {
            sum  = i;
        }
        i  = 2;
    }
    printf("%ld ", sum);
}

It takes around 480 secs to run and I was wondering if there was a better solution or tips to improve my program.

________________________________________________________
Executed in  480.95 secs    fish           external
   usr time  478.54 secs    0.23 millis  478.54 secs
   sys time    1.28 secs    6.78 millis    1.28 secs

CodePudding user response:

With two little modifications your code becomes magnitudes faster:

#include <stdio.h>
#include <math.h>

int main() {
  long long sum = 5;    // we need long long, long might not be enough
                        // depending on your platform
  int i = 5;
  int count = 0;
  while (i <= 2000000) {
    count = 0;

    int limit = sqrt(i);                  // determine upper limit once and for all
    for (int j = 3; j <= limit; j  = 2) { // use upper limit sqrt(i) instead if i/2
        if (i % j == 0) {
          count = 1;
          break;                          // break out from loop as soon 
                                          // as number is not prime
        }
    }
    if (count == 0) {
      sum  = i;
    }
    i  = 2;
  }
  printf("%lld ", sum);                   // we need %lld for long long
}

All explanations are in the comments.

But there are certainly better and even faster ways to do this.

I ran this on my 10 year old MacPro and for the 20 million first primes it took around 30 seconds.

CodePudding user response:

This program computes near instantly (even in Debug...) the sum for 2 millions, just need one second for 20 millions (Windows 10, 10 years-old i7 @ 3.4 GHz, MSVC 2019).

Note: Didn't had time to set up my C compiler, it's why there is a cast on the malloc.

The "big" optimization is to store square values AND prime numbers, so absolutely no impossible divisor is tested. Since there is no more than 1/10th of primes within a given integer interval (heuristic, a robust code should test that and realloc the primes array when needed), the time is drastically cut.

#include <stdio.h>
#include <malloc.h>

#define LIMIT 2000000ul     // Computation limit.

typedef struct {
    unsigned long int   p ;     // Store a prime number.
    unsigned long int   sq ;    // and its square.
} prime ;

int main() {
    prime* primes = (prime*)malloc((LIMIT/10)*sizeof(*primes)) ;        // Store found primes. Can quite safely use 1/10th of the whole computation limit.
    unsigned long int primes_count=1 ;
    unsigned long int i = 3 ;
    unsigned long long int sum = 0 ;
    unsigned long int j = 0 ;
    int is_prime = 1 ;

    // Feed the first prime, 2.
    primes[0].p = 2 ;
    primes[0].sq = 4 ;
    sum = 2 ;
    // Parse all numbers up to LIMIT, ignoring even numbers.
    // Also reset the "is_prime" flag at each loop.
    for (i = 3 ; i <= LIMIT ; i =2, is_prime = 1 ) {
        // Parse all previously found primes.
        for (j = 0; j < primes_count; j  ) {
            // Above sqrt(i)? Break, i is a prime.
            if (i<primes[j].sq)
                break ;
            // Found a divisor? Not a prime (and break).
            if ((i % primes[j].p == 0)) {
                is_prime = 0 ;
                break ;
            }
        }
        // Add the prime and its square to the array "primes".
        if (is_prime) {
            primes[primes_count].p = i ;
            primes[primes_count  ].sq = i*i ;
            // Compute the sum on-the-fly
            sum  = i ;
        }
    }
    printf("Sum of all %lu primes: %llu\n", primes_count, sum);
    free(primes) ;
}

CodePudding user response:

Your program can easily be improved by stopping the inner loop earlier:

  • when i exceeds sqrt(j).
  • when a divisor has been found.

Also note that type long might not be large enough for the sum on all architectures. long long is recommended.

Here is a modified version:

#include <stdio.h>

int main() {
    long long sum = 5;  // Already counting 2 and 3 in my sum.
    long i = 5;     // Checking from 5 
    while (i <= 2000000) {
        int count = 0;
        for (int j = 3; j * j <= i; j  = 2) {
            // Checking if i (starting from 5) is divisible from 3 
            if (i % j == 0) {   // to i/2 and only checking for odd values of j
                count = 1;
                break;
            }
        }
        if (count == 0) {
            sum  = i;
        }
        i  = 2;
    }
    printf("%lld\n", sum);
}

This simple change drastically reduces the runtime! It is more than 1000 times faster for 2000000:

chqrlie> time ./primesum
142913828922

real    0m0.288s
user    0m0.264s
sys     0m0.004s

Note however that trial division is much less efficient than the classic sieve of Eratosthenes.

Here is a simplistic version:

#include <stdio.h>
#include <stdlib.h>

int main() {
    long max = 2000000;
    long long sum = 0;
    // Allocate an array of indicators initialized to 0
    unsigned char *composite = calloc(1, max   1);
    // For all numbers up to sqrt(max)
    for (long i = 2; i * i <= max; i  ) {
        // It the number is a prime
        if (composite[i] == 0) {
            // Set all multiples as composite. Multiples below the
            // square of i are skipped because they have already been
            // set as multiples of a smaller prime.
            for (long j = i * i; j <= max; j  = i) {
                composite[j] = 1;
            }
        }
    }
    for (long i = 2; i <= max; i  ) {
        if (composite[i] == 0)
            sum  = i;
    }
    printf("%lld\n", sum);
    free(composite);
    return 0;
}

This code is another 20 times faster for 2000000:

chqrlie> time ./primesum-sieve
142913828922

real    0m0.014s
user    0m0.007s
sys     0m0.002s

The sieve approach can be further improved in many ways for larger boundaries.

  • Related