I've tried this problem from Project Euler where I need to calculate the sum of all primes until two million.
This is the solution I've come up with -
#include <stdio.h>
int main() {
long sum = 5; // Already counting 2 and 3 in my sum.
int i = 5; // Checking from 5
int count = 0;
while (i <= 2000000) {
count = 0;
for (int j = 3; j <= i / 2; j = 2) {
// Checking if i (starting from 5) is divisible from 3
if (i % j == 0) { // to i/2 and only checking for odd values of j
count = 1;
}
}
if (count == 0) {
sum = i;
}
i = 2;
}
printf("%ld ", sum);
}
It takes around 480 secs to run and I was wondering if there was a better solution or tips to improve my program.
________________________________________________________
Executed in 480.95 secs fish external
usr time 478.54 secs 0.23 millis 478.54 secs
sys time 1.28 secs 6.78 millis 1.28 secs
CodePudding user response:
With two little modifications your code becomes magnitudes faster:
#include <stdio.h>
#include <math.h>
int main() {
long long sum = 5; // we need long long, long might not be enough
// depending on your platform
int i = 5;
int count = 0;
while (i <= 2000000) {
count = 0;
int limit = sqrt(i); // determine upper limit once and for all
for (int j = 3; j <= limit; j = 2) { // use upper limit sqrt(i) instead if i/2
if (i % j == 0) {
count = 1;
break; // break out from loop as soon
// as number is not prime
}
}
if (count == 0) {
sum = i;
}
i = 2;
}
printf("%lld ", sum); // we need %lld for long long
}
All explanations are in the comments.
But there are certainly better and even faster ways to do this.
I ran this on my 10 year old MacPro and for the 20 million first primes it took around 30 seconds.
CodePudding user response:
This program computes near instantly (even in Debug...) the sum for 2 millions, just need one second for 20 millions (Windows 10, 10 years-old i7 @ 3.4 GHz, MSVC 2019).
Note: Didn't had time to set up my C compiler, it's why there is a cast on the malloc
.
The "big" optimization is to store square values AND prime numbers, so absolutely no impossible divisor is tested. Since there is no more than 1/10th of primes within a given integer interval (heuristic, a robust code should test that and realloc the primes
array when needed), the time is drastically cut.
#include <stdio.h>
#include <malloc.h>
#define LIMIT 2000000ul // Computation limit.
typedef struct {
unsigned long int p ; // Store a prime number.
unsigned long int sq ; // and its square.
} prime ;
int main() {
prime* primes = (prime*)malloc((LIMIT/10)*sizeof(*primes)) ; // Store found primes. Can quite safely use 1/10th of the whole computation limit.
unsigned long int primes_count=1 ;
unsigned long int i = 3 ;
unsigned long long int sum = 0 ;
unsigned long int j = 0 ;
int is_prime = 1 ;
// Feed the first prime, 2.
primes[0].p = 2 ;
primes[0].sq = 4 ;
sum = 2 ;
// Parse all numbers up to LIMIT, ignoring even numbers.
// Also reset the "is_prime" flag at each loop.
for (i = 3 ; i <= LIMIT ; i =2, is_prime = 1 ) {
// Parse all previously found primes.
for (j = 0; j < primes_count; j ) {
// Above sqrt(i)? Break, i is a prime.
if (i<primes[j].sq)
break ;
// Found a divisor? Not a prime (and break).
if ((i % primes[j].p == 0)) {
is_prime = 0 ;
break ;
}
}
// Add the prime and its square to the array "primes".
if (is_prime) {
primes[primes_count].p = i ;
primes[primes_count ].sq = i*i ;
// Compute the sum on-the-fly
sum = i ;
}
}
printf("Sum of all %lu primes: %llu\n", primes_count, sum);
free(primes) ;
}
CodePudding user response:
Your program can easily be improved by stopping the inner loop earlier:
- when
i
exceedssqrt(j)
. - when a divisor has been found.
Also note that type long
might not be large enough for the sum on all architectures. long long
is recommended.
Here is a modified version:
#include <stdio.h>
int main() {
long long sum = 5; // Already counting 2 and 3 in my sum.
long i = 5; // Checking from 5
while (i <= 2000000) {
int count = 0;
for (int j = 3; j * j <= i; j = 2) {
// Checking if i (starting from 5) is divisible from 3
if (i % j == 0) { // to i/2 and only checking for odd values of j
count = 1;
break;
}
}
if (count == 0) {
sum = i;
}
i = 2;
}
printf("%lld\n", sum);
}
This simple change drastically reduces the runtime! It is more than 1000 times faster for 2000000:
chqrlie> time ./primesum
142913828922
real 0m0.288s
user 0m0.264s
sys 0m0.004s
Note however that trial division is much less efficient than the classic sieve of Eratosthenes.
Here is a simplistic version:
#include <stdio.h>
#include <stdlib.h>
int main() {
long max = 2000000;
long long sum = 0;
// Allocate an array of indicators initialized to 0
unsigned char *composite = calloc(1, max 1);
// For all numbers up to sqrt(max)
for (long i = 2; i * i <= max; i ) {
// It the number is a prime
if (composite[i] == 0) {
// Set all multiples as composite. Multiples below the
// square of i are skipped because they have already been
// set as multiples of a smaller prime.
for (long j = i * i; j <= max; j = i) {
composite[j] = 1;
}
}
}
for (long i = 2; i <= max; i ) {
if (composite[i] == 0)
sum = i;
}
printf("%lld\n", sum);
free(composite);
return 0;
}
This code is another 20 times faster for 2000000:
chqrlie> time ./primesum-sieve
142913828922
real 0m0.014s
user 0m0.007s
sys 0m0.002s
The sieve approach can be further improved in many ways for larger boundaries.