I want to parallelize the for loops and I can't seem to grasp the concept, every time I try to parallelize them it still works but it slows down dramatically.
for(i=0; i<nbodies; i){
for(j=i 1; j<nbodies; j) {
d2 = 0.0;
for(k=0; k<3; k) {
rij[k] = pos[i][k] - pos[j][k];
d2 = rij[k]*rij[k];
if (d2 <= cut2) {
d = sqrt(d2);
d3 = d*d2;
for(k=0; k<3; k) {
double f = -rij[k]/d3;
forces[i][k] = f;
forces[j][k] -= f;
}
ene = -1.0/d;
}
}
}
}
I tried using synchronization with barrier and critical in some cases but nothing happens or the processing simply does not end.
CodePudding user response:
Using barriers or other synchronizations will slow down your code, if the amount of unsynchronized work is not larger by a good factor. That is not the case with you. You probably need to reformulate your code to remove synchronization.
You are doing something like an N-body simulation. I've worked out a couple of solutions here: https://pages.tacc.utexas.edu/~eijkhout/pcse/html/omp-examples.html#N-bodyproblems
Also: your d2
loop is a reduction, so you can treat it like that, but it is probably enough if that variable is private to the i,j
iterations.
CodePudding user response:
You should always define your variables in their minimal required scope, especially if performance is an issue. (Note that if you do so your compiler can create more efficient code). Besides performance it also helps to avoid data race.
I think you have misplaced a curly brace and the condition in the first for
loop should be i<nbodies-1
. Variable ene
can be summed up using reduction and atomic operations have to be used to increase array forces
, so your code should look something like this:
#pragma omp parallel for reduction( :ene)
for(int i=0; i<nbodies-1; i){
for(int j=i 1; j<nbodies; j) {
double d2 = 0.0;
double rij[3];
for(int k=0; k<3; k) {
rij[k] = pos[i][k] - pos[j][k];
d2 = rij[k]*rij[k];
}
if (d2 <= cut2) {
double d = sqrt(d2);
double d3 = d*d2;
for(int k=0; k<3; k) {
double f = -rij[k]/d3;
#pragma omp atomic
forces[i][k] = f;
#pragma omp atomic
forces[j][k] -= f;
}
ene = -1.0/d;
}
}
}
}