I want to fetch data in parallel. For example, I have an array of user IDs
(strings), and I want to fetch all of those users in parallel and end up with an array of Users:
func getUser(userIDs []string) {
    var users []User
    var user User
    // We create a WaitGroup: it blocks until N tasks say they are done
    wg := sync.WaitGroup{}
    // We add one per user ID to the wait group - each worker will decrease it back
    wg.Add(len(userIDs))
    // range yields (index, value), so the index is discarded with _
    for _, id := range userIDs {
        user, err = h.Repository.GetUserByID(id)
        // Here is the problem. I should do a
        go h.Repository.GetUserByID(id)
        // to be parallel, but then I can not receive the user result
        users = append(users, user)
        wg.Done()
    }
    // Now we wait for everyone to finish - again, not a must.
    // You can just receive from the channel N times, and use a timeout or something for safety
    wg.Wait()
}
How can I call the function that fetches a user in parallel and still save each result in the array?
Should the declaration var user User be inside the loop? Can I get race conditions if it is outside?
Answer:
Given that you have no control over Repository.GetUserByID and there is no way to pass a channel directly to it, I would do something like:
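// Requires: import "log"
func getUser(userIDs []string) []User {
    // result carries either a user or the error from fetching it
    type result struct {
        user User
        err  error
    }
    ch := make(chan result)
    // range yields (index, value); the ID is the value
    for _, id := range userIDs {
        go func(id string) {
            user, err := h.Repository.GetUserByID(id)
            // Always send, even on error, so the receive loop below never blocks forever
            ch <- result{user, err}
        }(id)
    }
    users := make([]User, 0, len(userIDs))
    // Receive exactly one result per ID; results arrive in completion order
    for range userIDs {
        r := <-ch
        if r.err != nil {
            log.Println(r.err)
            continue
        }
        users = append(users, r.user)
    }
    return users
}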
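Appending to a shared slice from several goroutines without synchronization is a data race, so it is not recommended. The same applies to a single user variable declared outside the loop and shared by all goroutines: declare it inside the goroutine instead. Go has channels for exactly this.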
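If you would rather keep the WaitGroup from the question, one possible alternative is to pre-size the result slice and let each goroutine write only to its own index. This is only a sketch: User, UserRepository, stubRepo and fetchUsers are hypothetical stand-ins for the types in the question, not your real code. Because no two goroutines touch the same element, no mutex is needed, and the results keep the order of the input IDs.

```go
package main

import (
	"fmt"
	"sync"
)

// User and UserRepository are hypothetical stand-ins for the question's types.
type User struct {
	ID   string
	Name string
}

type UserRepository interface {
	GetUserByID(id string) (User, error)
}

// stubRepo is an in-memory fake, used only to make the sketch runnable.
type stubRepo map[string]User

func (r stubRepo) GetUserByID(id string) (User, error) {
	u, ok := r[id]
	if !ok {
		return User{}, fmt.Errorf("user %q not found", id)
	}
	return u, nil
}

// fetchUsers fetches every ID concurrently. Goroutine i writes only to
// users[i] and errs[i], so the goroutines never share an element.
// users[i] and errs[i] correspond to userIDs[i].
func fetchUsers(repo UserRepository, userIDs []string) ([]User, []error) {
	users := make([]User, len(userIDs))
	errs := make([]error, len(userIDs))

	var wg sync.WaitGroup
	wg.Add(len(userIDs))
	for i, id := range userIDs {
		go func(i int, id string) {
			defer wg.Done()
			// user and err are local to this goroutine, so they are not shared.
			user, err := repo.GetUserByID(id)
			users[i], errs[i] = user, err
		}(i, id)
	}
	wg.Wait() // every write above is complete once Wait returns
	return users, errs
}
```

Calling fetchUsers(repo, ids) returns one slot per ID; check errs[i] before using users[i]. Note that this works only because the slice length is fixed up front. Calling append from several goroutines would still be a race.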