I am making a program that monitors different webpages, each time a new url is added to a page, I would like to start a new goroutine to scrape the new url. I am trying to simulate this like this:
package main
import (
"fmt"
"sync"
"time"
)
func main() {
var Wg sync.WaitGroup
link := make(chan string)
startList := []string{}
go func() {
for i := 0; i < 20; i {
//should simulate the monitoring of the original web page
nextLink := fmt.Sprintf("cool-website-%d", i)
link <- nextLink
}
}()
for i := 0; i < 20; i {
newLink := <-link
startList = append(startList, newLink)
Wg.Add(1)
go simulateScraping(i, startList[i])
Wg.Done()
}
Wg.Wait()
}
func simulateScraping(i int, link string) {
fmt.Printf("Simulating process %d\n", i)
fmt.Printf("scraping www.%s.com\n", link)
time.Sleep(time.Duration(30) * time.Second)
fmt.Printf("Finished process %d\n", i)
}
This results in the following error fatal error: all goroutines are asleep - deadlock!
.
How do I only start the simulateScraping function each time that newLink is updated or when startList is appended to?
Thanks!
CodePudding user response:
I see several problems with the code.
- Wait group is useless in the code because
Wg.Done
is called immediately and does not wait until thesimulateScraping
finishes, because it's running in parallel.
To fix this, the closure function could be used
go func(i int) {
simulateScraping(i, newLink)
Wg.Done()
}(i)
- Instead of an increment loop, I would use for-each range loop. It allows code to be executed as soon as a new value get to a channel and automatically breaks when the channel closes.
var i int
for newLink := range link {
Wg.Add(1)
go func(i int) {
simulateScraping(i, newLink)
Wg.Done()
}(i)
i
}
Wg.Wait()
startList := []string{}
Looks useless. Not sure how it was supposed to be used.Channel must be closed.
go func() {
for i := 0; i < 20; i {
//should simulate the monitoring of the original web page
nextLink := fmt.Sprintf("cool-website-%d", i)
link <- nextLink
}
close(link) // Closing the channel
}()
The whole code
package main
import (
"fmt"
"sync"
"time"
)
func main() {
var Wg sync.WaitGroup
link := make(chan string)
go func() {
for i := 0; i < 20; i {
//should simulate the monitoring of the original web page
nextLink := fmt.Sprintf("cool-website-%d", i)
link <- nextLink
}
close(link)
}()
var i int
for newLink := range link {
Wg.Add(1)
go func(i int) {
simulateScraping(i, newLink)
Wg.Done()
}(i)
i
}
Wg.Wait()
}
func simulateScraping(i int, link string) {
fmt.Printf("Simulating process %d\n", i)
fmt.Printf("scraping www.%s.com\n", link)
time.Sleep(3 * time.Second)
fmt.Printf("Finished process %d\n", i)
}
Here is a good talk about "Concurrency Patterns In Go"