Had a hard time expressing this question in a sentence. Here is the situation:
I'm trying to spawn off a set of goroutines to recurse over directories and find matching files. Then I collect those files and continue on to process them. However, the catch is that I don't know how many files each routine will find, so I'm having a hard time figuring out how to get the main thread to exit once all the routines are done
I could just make the channel buffer crazy big but that's not a good solution, this tool doesn't need to be 100% robust but good enough where it's not breaking all the time. Plus there's a chance it could turn up a lot of files
// start a routine to traverse each directory
fpchan := make(chan string, 100)
for _, dir := range numDirs {
fmt.Printf("Searching for file in %s\n", dir)
go findLogs(searchString, dir, fpchan)
}
// collect filepaths from channel
files := make([]string, 0, maxLogs)
for file := range fpchan { // I'LL GET STUCK WHEN EVERYTHING COMPLETES, NOTHING TO RECEIVE
if len(files) <= cap(files) {
files.append(file)
} else {
fmt.Println("Reached max logfile count of %d\n", maxLogs)
}
}
A waitgroup doesn't really work because the channel could fill up and the routines would be stuck (since I don't know how many results there will be ahead of time)
Is it kosher to send an empty string on the channel as a way for the goroutine to signal "complete"? Like the following:
// collect filepaths from channel
files := make([]string, 0, maxLogs)
for file := range fpchan {
if file == "" {
fmt.Println("goroutine finished")
numRunning--
if numRunning == 0 {
break
}
continue
}
if len(files) <= cap(files) {
files.append(file)
} else {
fmt.Println("Reached max logfile count of %d\n", maxLogs)
}
}
For this situation since an empty filepath would be invalid, it would work as a poor man's signal. It just feels like a hack for which there should be a better solution I would think
Any way to do this in a way that isn't horribly complicated (bunch of extra channels, non-blocking receives, etc)? If it's much more complicated than this I'd just do them in sequence but I thought it'd be a good chance to take advantage of concurrency
CodePudding user response:
You can use a WaitGroup
here. But you need to wait for the waitgroup in a goroutine, and in that goroutine you close the channel when the waitgroup has completed, so that your main loop terminates:
var wg sync.WaitGroup
// start a routine to traverse each directory
fpchan := make(chan string, 100)
for _, dir := range numDirs {
fmt.Printf("Searching for file in %s\n", dir)
wg.Add(1)
go func(dir string) {
findLogs(searchString, dir, fpchan)
wg.Done()
}(dir)
}
go func() {
wg.Wait()
close(fpchan)
}()
(your collection loop over fpchan
remains the same)