I am writing a program that crawls websites and returns their status.
I implemented it in two ways. The first uses a mutex to prevent concurrent writes to the map, which gets rid of the data race. The second does the same job with channels. When I benchmarked them, I found that the channel version is much faster than the mutex version. Why is that? Why does the mutex version perform so poorly? Am I doing something wrong with the mutex?
Benchmark result:
Code
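package concurrency

import "sync"

// WebsiteChecker reports whether the site at the given URL is up.
type WebsiteChecker func(string) bool

// result pairs a URL (string) with the outcome of its check (bool).
type result struct {
	string
	bool
}

// CheckWebsites checks every URL in its own goroutine and guards the map with a mutex.
func CheckWebsites(wc WebsiteChecker, urls []string) map[string]bool {
	results := make(map[string]bool)
	var wg sync.WaitGroup
	var mu sync.Mutex

	for _, url := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			mu.Lock()
			results[u] = wc(u)
			mu.Unlock()
		}(url)
	}

	wg.Wait()
	return results
}

// CheckWebsitesChannel checks every URL in its own goroutine and collects results over a channel.
func CheckWebsitesChannel(wc WebsiteChecker, urls []string) map[string]bool {
	results := make(map[string]bool)
	resultChannel := make(chan result)

	for _, url := range urls {
		go func(u string) {
			resultChannel <- result{u, wc(u)}
		}(url)
	}

	// Only this goroutine writes to the map, so no lock is needed.
	for i := 0; i < len(urls); i++ {
		r := <-resultChannel
		results[r.string] = r.bool
	}

	return results
}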
Test code
package concurrency

import (
	"reflect"
	"testing"
	"time"
)

// mockWebsiteChecker simulates a slow check; only the localhost URL is reported as down.
func mockWebsiteChecker(url string) bool {
	time.Sleep(20 * time.Millisecond)
	if url == "https://localhost:3000" {
		return false
	}
	return true
}

func TestCheckWebsites(t *testing.T) {
	websites := []string{
		"https://google.com",
		"https://localhost:3000",
		"https://blog.gypsydave5.com",
	}

	want := map[string]bool{
		"https://google.com":          true,
		"https://blog.gypsydave5.com": true,
		"https://localhost:3000":      false,
	}

	got := CheckWebsites(mockWebsiteChecker, websites)

	if !reflect.DeepEqual(got, want) {
		t.Errorf("got %v, want %v", got, want)
	}
}

func BenchmarkCheckWebsites(b *testing.B) {
	urls := make([]string, 1000)
	for i := 0; i < len(urls); i++ {
		urls[i] = "a url"
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		CheckWebsites(mockWebsiteChecker, urls)
	}
}

func BenchmarkCheckWebsitesChannel(b *testing.B) {
	urls := make([]string, 1000)
	for i := 0; i < len(urls); i++ {
		urls[i] = "a url"
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		CheckWebsitesChannel(mockWebsiteChecker, urls)
	}
}
Answer:
It seems to me that the mutex version protects not only the results map but also the call to wc: the call can only take place once the lock has been acquired, so you are effectively serializing the calls. A channel send, by contrast, only locks the channel once the value on the right-hand side has been evaluated, so the calls to wc can run concurrently. See whether code like this performs better with the mutex:
go func(u string) {
	defer wg.Done()
	r := wc(u) // run the check outside the lock
	mu.Lock()
	results[u] = r // hold the lock only for the map write
	mu.Unlock()
}(url)