How to make part of the loop execute only in one of the workers?

Time:06-23

I have a loop opening multiple connections to a remote SFTP-server. Each session needs to verify the remote host's key, but the existing list of keys only needs to be loaded once.

The libssh2 API I'm using requires an existing session for loading the list, so I cannot read it before entering the loop, because the sessions are initialized inside it.

Are there pragmas that will allow me to:

  1. Make one of the threads -- I don't care which one -- load ~/.ssh/known_hosts.
  2. Make all of the other threads wait for that lucky one to finish?

I tried:

    #pragma omp for
    for (i = 0; i < jobs; i++) {
        sessions[i] = libssh2_session_init();
        ...
        #pragma omp task
        {
            known = load_known_hosts(sessions[i], ...);
        }

        #pragma omp taskwait
        check_known_hosts(known, ...);
        ...
    }

This compiles but does not work as intended: the list is still uninitialized when check_known_hosts() tries to use it.

CodePudding user response:

My suggestion is the following (similar to what @deamcrash suggested, but the session initialization can also be done in parallel):

    #pragma omp parallel
    {
        ...
        #pragma omp for
        for (i = 0; i < jobs; i++) {
            // initialization in parallel
            sessions[i] = libssh2_session_init();
            ...
        }
        // implicit barrier: all sessions are initialized past this point

        #pragma omp single
        {
            // work done by a single thread
            known = load_known_hosts(...);
            ...
        }
        // implicit barrier: the other threads wait here until the list is loaded

        #pragma omp for
        for (i = 0; i < jobs; i++) {
            // other parallel work
            check_known_hosts(known, ...);
            ...
        }
        ...
    }

CodePudding user response:

Without more context I would go for:

    #pragma omp single
    {
        sessions[0] = libssh2_session_init();
        ...
        known = load_known_hosts(sessions[0], ...);
        ...
        check_known_hosts(known, ...);
        ...
    }

    #pragma omp for
    for (i = 1; i < jobs; i++) {
        sessions[i] = libssh2_session_init();
        ...
        check_known_hosts(known, ...);
        ...
    }

That is, I would peel the first iteration out of the loop and execute it once, before the parallel loop, so that the list is already loaded when the remaining iterations run. To keep this clean, you could extract the code executed in each loop iteration into a function.
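A minimal sketch of that idea follows. The types `session_t` and `known_hosts_t`, the helper `open_session()`, and the simplified `load_known_hosts()` / `check_known_hosts()` signatures are hypothetical stand-ins so the example compiles without libssh2; they are not the library's API. The first job runs serially and loads the list, and the remaining jobs only read it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libssh2 session and known-hosts objects. */
typedef struct { int id; } session_t;
typedef struct { int loaded; } known_hosts_t;

static known_hosts_t hosts_storage;

/* Stand-in for libssh2_session_init(). */
static void open_session(session_t *s, int i) { s->id = i; }

/* Stand-in for reading ~/.ssh/known_hosts through an existing session. */
static known_hosts_t *load_known_hosts(session_t *s)
{
    (void)s;
    hosts_storage.loaded = 1;
    return &hosts_storage;
}

/* Stand-in for the host key check; returns 1 when the list is usable. */
static int check_known_hosts(const known_hosts_t *known, const session_t *s)
{
    (void)s;
    return known != NULL && known->loaded;
}

/* Runs all jobs and returns how many passed the host key check. */
int run_jobs(session_t *sessions, int jobs)
{
    known_hosts_t *known;
    int passed, i;

    if (jobs <= 0)
        return 0;

    /* First iteration, peeled off and run serially: it loads the list. */
    open_session(&sessions[0], 0);
    known = load_known_hosts(&sessions[0]);
    assert(known != NULL);
    passed = check_known_hosts(known, &sessions[0]);

    /* Remaining iterations only read `known`, so they can run in parallel. */
    #pragma omp parallel for reduction(+:passed)
    for (i = 1; i < jobs; i++) {
        open_session(&sessions[i], i);
        passed += check_known_hosts(known, &sessions[i]);
    }
    return passed;
}
```

Built without OpenMP support, the pragma is ignored and the same code runs serially, which makes the sketch easy to test either way.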

CodePudding user response:

My quick and dirty "solution" was to load the list for the first job only, and make the other workers spin if they arrive at this point before job 0 does:

    if (i == 0) {
        known = load_known_hosts(sessions[0], ...);
        if (known == NULL)
            errx(EX_SOFTWARE, "Couldn't load known hosts");
    } else while (known == NULL)
        pthread_yield();

I'm sure there is a way to do this properly, with OpenMP pragmas, so I'm still awaiting the proper answer.
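One pragma-only way to get the same effect inside a single loop might be a named `critical` section: whichever thread enters first loads the list, and every later thread waits at the section, finds the pointer already set, and copies it. The sketch below assumes hypothetical stand-ins (`session_t`, `known_hosts_t`, `open_session()`, simplified `load_known_hosts()` / `check_known_hosts()` signatures, and a `load_calls` counter) in place of the real libssh2 calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libssh2 session and known-hosts objects. */
typedef struct { int id; } session_t;
typedef struct { int loaded; } known_hosts_t;

static known_hosts_t hosts_storage;
int load_calls;  /* how many times the list was loaded in the last run */

/* Stand-in for libssh2_session_init(). */
static void open_session(session_t *s, int i) { s->id = i; }

/* Stand-in for reading ~/.ssh/known_hosts through an existing session. */
static known_hosts_t *load_known_hosts(session_t *s)
{
    (void)s;
    load_calls++;
    hosts_storage.loaded = 1;
    return &hosts_storage;
}

/* Stand-in for the host key check; returns 1 when the list is usable. */
static int check_known_hosts(const known_hosts_t *known, const session_t *s)
{
    (void)s;
    return known != NULL && known->loaded;
}

/* Runs all jobs and returns how many passed the host key check. */
int run_jobs(session_t *sessions, int jobs)
{
    known_hosts_t *known = NULL;  /* shared by all threads */
    int passed = 0, i;

    load_calls = 0;

    #pragma omp parallel for reduction(+:passed)
    for (i = 0; i < jobs; i++) {
        known_hosts_t *mine;  /* this iteration's copy of the pointer */

        open_session(&sessions[i], i);

        /* One thread at a time: the first one in loads the list, the
           others wait here and then find `known` already set. */
        #pragma omp critical(known_hosts_init)
        {
            if (known == NULL)
                known = load_known_hosts(&sessions[i]);
            mine = known;
        }

        /* A real program would report a failed load instead of asserting. */
        assert(mine != NULL);
        passed += check_known_hosts(mine, &sessions[i]);
    }
    return passed;
}
```

Every iteration passes through the critical section, so there is a small serialization cost per job, but the shared pointer is only read and written under the section, which avoids the unsynchronized spin on `known`.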
