Openmp not parallelizing for loop, gives sequential execution

Time:02-15

I am trying to parallelize a for loop in my code base that should be embarrassingly parallel. However, OpenMP is not doing so and instead executes everything sequentially. The program is compiled with g++ and -std=c++11. I ran a small test program beforehand to verify that OpenMP works, and it worked just fine.

The code block I am trying to parallelize is given below

void class_tmv::activate(class_tmv &result, const &a, const &b, const &c, const &d, const &e, f) const
    {
        result.clear();
        #pragma omp parallel for 
        for (unsigned int i = 0; i < tms.size(); ++i)
        {
            class_TM tmTemp;
            tms[i].activate(tmTemp, a, b, c, d, e, f);
            result.tms[i] = tmTemp;
        }
    }

Class_tmv has a member variable tms, which is essentially a vector of Class_TM objects. Class_TM also has a method named activate, which is called above and defined as

    inline void Class_TM::activate(Class_TM &result, const &a, const &b, const &c, const &d, const &e, f) const
    {
        
        result.clear();
        Class_TM tmTemp;

        if (condition_1)
        {
            this->S_T(tmTemp, a, b, c, d, e, f);
        }
        else if (condition_2)
        {
            this->T_T(tmTemp, a, b, c, d, e, f);
        }
        else
        {
            cout << "The activation function cannot be parsed." << endl;
        }
        result = tmTemp;
    }

S_T and T_T are other methods in class_TM.

The issue I'm having is that the overall execution of the system is completely sequential: the loop I'm trying to parallelize isn't running in parallel at all.

Any suggestions on what may be going wrong are extremely helpful. Any other solutions not related to OpenMP are also welcome.

(This is my first time working on parallel applications)

CodePudding user response:

Did you use the -fopenmp flag when compiling your code?

CodePudding user response:

Your full code or more context on the command used to compile would be helpful, but here are some pointers in the meantime:

  • As the other answer mentioned, you should compile with -fopenmp or a similar flag (depending on the compiler). However, since you mention that you ran a test and verified OpenMP worked correctly, it's likely you did include that option, as well as the omp.h header in your source files.

  • I noticed that in the for loop you attempt to parallelize you use ++i, so that means that you are skipping the 0th element of tms and result.tms. I don't know if this is what you want, or whether you instead need i++.

  • As you seem to be experiencing, if you compile with OpenMP pragmas but the compiler sees no way to parallelize your for loop, or the loop is not in the "Canonical Form", it will not be parallelized. Among other things, the Canonical Form requires that the stopping criterion is fixed and does not change across iterations (in this case, tms.size()). If the call to tms[i].activate() (or the subsequent calls to S_T and T_T) modifies the size of tms, the loop would not be in Canonical Form and would not be parallelized. Check whether this is the case in your context.
