Parallel algorithm in golang to sum of elements in a vector-CodePudding

I am implementing some parallel algorithms as an exercise in Golang. Right now I am trying to sum all the elements in a vector, but to do that, I need a barrier. I googled around but I could not find anything that could help me.

This is what my code looks like:

package main

import (
    "fmt"
    "math"
    "sync"
)

func main() {
    var wg sync.WaitGroup

    sumWorkerFunc := func(k int, a []int, br *sync.WaitGroup) {
        bound := int(math.Ceil(math.Log2(float64(k))))
        for i := 1; i < bound; i   {
            if k%int(math.Pow(2, float64(i))) == 0 {
                a[k] = a[k-int(math.Pow(2, float64(i-1)))]   a[k]
            }

            /* barrier here */
        }

        wg.Done()
    }

    a := []int{0, 1, 2, 3, 4, 5, 6, 7}

    fmt.Println("Before:")
    fmt.Println(a)

    workers := 8
    wg.Add(workers)

    for k := 0; k < workers; k   {
        go sumWorkerFunc(k, a, br)
    }
    wg.Wait()

    fmt.Println("After:")
    fmt.Println(a)
}

I need to wait for all the workers to be done before starting the next iteration as they need the results for the next iteration. This is what I tried to do:

package main

import (
    "fmt"
    "math"
    "sync"
)

func main() {
    var wg sync.WaitGroup

    sumWorkerFunc := func(k int, a []int, br *sync.WaitGroup) {
        bound := int(math.Ceil(math.Log2(float64(k))))
        for i := 1; i < bound; i   {
            if k%int(math.Pow(2, float64(i))) == 0 {
                a[k] = a[k-int(math.Pow(2, float64(i-1)))]   a[k]
            }

            br.Done()
            br.Wait() // this should not be here
            br.Add(1)
        }

        wg.Done()
    }

    a := []int{0, 1, 2, 3, 4, 5, 6, 7}

    fmt.Println("Before:")
    fmt.Println(a)

    workers := 8
    wg.Add(workers)

    var barrier sync.WaitGroup
    barrier.Add(workers)

    for k := 0; k < workers; k   {
        go sumWorkerFunc(k, a, &barrier)
    }
    wg.Wait()

    fmt.Println("After:")
    fmt.Println(a)
}

But I cannot place a Wait() there because it will be called by all the workers. What would be a correct way of implementing a barrier there? I am starting to think that maybe this problem is oriented more towards the shared memory model which may not suitable for Golang.

Thanks!

EDIT:

I added an example of what I am trying to achieve:

5      2      1      3      5       8      1      1
|      |      |      |      |       |      |      |
|_ _ _ 7      |_ _ _ 4      |_ _ _ 13      |_ _ _ 2
       |             |              |             |
       |_ _ _ _ _ _ 11              |_ _ _ _ _ _ 15
                     |                            |
                     |_ _ _ _ _ _ _ _ _ _ _ _ _  26

where each worker is responsible for one element of the array.

CodePudding user response：

You need more than just the one WaitGroup to coordinate this routine. Look at this pattern:

func main() {
    const size = 100
    var (
        wg sync.WaitGroup
        a  [size]int
    )

    // Fill array a with all ones
    for i := 0; i < size; i   {
        go func(x int) {
            wg.Add(1)
            a[x] = 1
            wg.Done()
        }(i)
    }

    wg.Wait()

    fmt.Println(a)
}

Each worker adds itself to the WaitGroup before doing work, then removes itself when it's done. Do you see the problem? Try running it yourself a few times and see the output.

wg.Wait() is only valid if you call it after all of the expected wg.Add(x) have been called. Since the wg.Add(1) are inside the goroutine without any other synchronization, we don't know for sure how many workers were added by the time main goes to wg.Wait(). For example, it's possible that out of the 100 workers, 50 of them all called wg.Add(1) and then wg.Done() before the remaining 50 did anything. So, wg.Wait() continues while 50 of the workers still have not completed. All of this is to say that wg.Add(x) must be synchronized!

var barrier sync.WaitGroup
barrier.Add(workers)

for k := 0; k < workers; k   {
    go sumWorkerFunc(k, a, &barrier)
}

In this case, the Add is synchronized because it happens before the workers begin to execute.

br.Done()
br.Wait()
br.Add(1)

There's no problem necessarily with having all of your workers call Wait, the problem is that the Add is not synchronized. You cannot use the WaitGroup to synchronize itself here. You need some additional feature (another WaitGroup, a lock, a channel, etc.) to create synchronization across the workers between rounds.

Here's the easiest solution I could come up with, which makes one WaitGroup per each round:

func main() {
    const (
        workers = 3
        rounds  = 5
    )

    work := func(i int, roundWgs []sync.WaitGroup) {
        for r := 0; r < rounds; r   {
            // This is the "work" we do each round
            fmt.Printf("round: %v, worker %v\n", r, i)

            // We are finished the current round, and will wait for the group.
            roundWgs[r].Done()
            roundWgs[r].Wait()
        }
    }

    // Each round of work has it's own WaitGroup, in which each worker must finish.
    var roundWgs = make([]sync.WaitGroup, rounds)
    for i := 0; i < rounds; i   {
        roundWgs[i].Add(workers)
    }

    // wg is our outermost WaitGroup, which waits until all work is done.
    var wg sync.WaitGroup
    wg.Add(workers)
    for i := 0; i < workers; i   {
        go func(j int) {
            defer wg.Done()
            work(j, roundWgs)
        }(i)
    }
    wg.Wait()
}

Output:

round: 0, worker 2
round: 0, worker 0
round: 0, worker 1
round: 1, worker 1
round: 1, worker 0
round: 1, worker 2
round: 2, worker 2
round: 2, worker 1
round: 2, worker 0
round: 3, worker 0
round: 3, worker 2
round: 3, worker 1
round: 4, worker 1
round: 4, worker 2
round: 4, worker 0

Again, the Adds are synchronized because they all happen before any workers begin.

CodePudding user response：

I think what you are trying to do is more like a shared memory scenario. It would be better if instead of deferring the wait you can put it just after the loop that starts the goroutines. And you should use a mutex to ensure mutual exclusion.

And I don't think order matters for this problem.