I am benchmarking a software library I created in Go, and I encountered a discrepancy between the total runtime and the ns/op figure. I am new to benchmarking, and neither Go's documentation nor past Stack Overflow questions cover benchmarking conceptually in much depth, so I am hoping someone with more conceptual knowledge can help me (and other Stack Overflow users in similar predicaments) understand what exactly is happening.
Benchmarking output for a task performed using native Go:
1000000000 0.6136 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/gabetucker2/gostack/benchmark 0.862s
Benchmarking output for the same task performed using my software library:
1576087 805.3 ns/op 544 B/op 21 allocs/op
PASS
ok github.com/gabetucker2/gostack/benchmark 2.225s
Notice two things:
- The ns/op of my software library is around 1300 times higher than the ns/op of native Go (805.3 vs. 0.6136)
- The total runtime of my software library is around 2.6 times longer than the runtime of native Go (2.225s vs. 0.862s)
It seems impossible to me that a very simple function from my software library should be 1300 times slower than native Go code, and it seems much more plausible that it is only about 2.6 times slower. So what exactly is going on here?
Just in case it is useful, here are the Benchmark functions being called:
func test_Native_CreateArray() {
	myArr := []int{1, 2, 3}
	gogenerics.RemoveUnusedError(myArr)
}
func test_Gostack_CreateArray() {
	myStack := MakeStack([]int{1, 2, 3})
	gogenerics.RemoveUnusedError(myStack)
}
// native Go
func Benchmark_Native_CreateArray(b *testing.B) {
	for i := 0; i < b.N; i++ {
		test_Native_CreateArray()
	}
}
// my software library "gostack"
func Benchmark_Gostack_CreateArray(b *testing.B) {
	for i := 0; i < b.N; i++ {
		test_Gostack_CreateArray()
	}
}
Any clarity would be greatly appreciated.
Answer:
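The first function ran 1_000_000_000 times at 0.61 ns/op, which accounts for about 0.61 seconds of the 0.862 seconds of total runtime.
The second function ran 1_576_087 times at 805 ns/op, which accounts for about 1.27 seconds of the 2.225 seconds. The two totals are not comparable because the iteration counts differ: the testing package chooses b.N itself so that each benchmark runs for roughly the default benchtime of one second. ns/op is the figure to compare. Forcing the second function to run 1_000_000_000 times would take around 805 seconds.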