I wrote this function uniq that takes in a sorted slice of ints and returns the slice with duplicates removed:
func uniq(x []int) []int {
	i := 0
	for i < len(x)-1 {
		if x[i] == x[i+1] {
			copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	return x
}
and uniq2, a rewrite of uniq with the same results:
func uniq2(x []int) []int {
	i := 0
	l := len(x)
	for i < l-1 {
		if x[i] == x[i+1] {
			copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	return x[:l]
}
The only difference between the two functions is that in uniq2, instead of re-slicing x and directly accessing len(x) each time, I save len(x) to a variable l and decrement it whenever I shift the slice.
I thought that uniq2 would be slightly faster than uniq because len(x) would no longer be called on every iteration, but in reality it is inexplicably much slower.
Here is a test that generates a random sorted slice and calls uniq/uniq2 on it 1000 times, which I run on Linux:
func main() {
	rand.Seed(time.Now().Unix())
	for i := 0; i < 1000; i++ {
		_ = uniq(genSlice())
		//_ = uniq2(genSlice())
	}
}
func genSlice() []int {
	x := make([]int, 0, 1000)
	for num := 1; num <= 10; num++ {
		amount := rand.Intn(1000)
		for i := 0; i < amount; i++ {
			x = append(x, num)
		}
	}
	return x
}
$ go build uniq.go
$ time ./uniq
uniq usually takes 5-6 seconds to finish, while uniq2 is more than two times slower, taking 12-15 seconds.
Why is uniq2, where I save the slice length to a variable, so much slower than uniq, where I directly call len? Shouldn't it be slightly faster?
CodePudding user response:
You expect roughly the same execution time because you think they do roughly the same thing.
The only difference between the two functions is that in uniq2, instead of slicing x and directly accessing len(x) each time, I save len(x) to a variable l and decrement it whenever I shift the slice.
This is wrong.
The first version does:
copy(x[i:], x[i+1:])
x = x[:len(x)-1]
And second does:
copy(x[i:], x[i+1:])
l--
The first difference is that the first version assigns (copies) a slice header, which is a reflect.SliceHeader value of 3 integers (24 bytes on a 64-bit architecture), while l-- is a simple decrement, which is much faster.
But the main difference does not stem from this. The main difference is that since the first version changes the x slice (the header, length included), you end up copying fewer and fewer elements, while the second version does not change x and always copies to the end of the slice: x[i+1:] is equivalent to x[i+1:len(x)].
To demonstrate, imagine you pass a slice with length=10 and having all equal elements. The first version will copy 9 elements first, then 8, then 7 etc. The second version will copy 9 elements first, then 9 again, then 9 again etc.
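The totals follow directly from this: for n equal elements, the shrinking version copies (n-1)+(n-2)+...+1 = n(n-1)/2 elements, while the fixed-end version copies n-1 elements on each of its n-1 iterations, i.e. (n-1)². A quick sketch of the arithmetic (the variable names here are mine):

```go
package main

import "fmt"

func main() {
	n := 1000
	// first version: the slice shrinks, so each copy is one element shorter
	shrinking := n * (n - 1) / 2
	// second version: every copy runs to the original end of the slice
	fixedEnd := (n - 1) * (n - 1)
	fmt.Println(shrinking, fixedEnd) // 499500 998001
}
```

For n = 1000 these give exactly the counts measured below.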
Let's modify your functions to count the number of copied elements:
func uniq(x []int) []int {
	count := 0
	i := 0
	for i < len(x)-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			x = x[:len(x)-1]
		} else {
			i++
		}
	}
	fmt.Println("uniq copied", count, "elements")
	return x
}
func uniq2(x []int) []int {
	count := 0
	i := 0
	l := len(x)
	for i < l-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:])
			l--
		} else {
			i++
		}
	}
	fmt.Println("uniq2 copied", count, "elements")
	return x[:l]
}
Testing it:
uniq(make([]int, 1000))
uniq2(make([]int, 1000))
Output is:
uniq copied 499500 elements
uniq2 copied 998001 elements
uniq2()
copies twice as many elements!
If we test it with a random slice:
uniq(genSlice())
uniq2(genSlice())
Output is:
uniq copied 7956671 elements
uniq2 copied 11900262 elements
Again, uniq2()
copies roughly 1.5 times more elements! (But this greatly depends on the random numbers.)
Try the examples on the Go Playground.
The "fix" is to modify uniq2() to copy only up to l:
copy(x[i:], x[i+1:l])
l--
With this "appropriate" change, performance is roughly the same.
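Putting the fix together, a complete corrected uniq2 (with the same element counter as above, to confirm it now does the same amount of copying as uniq; the function name is mine):

```go
package main

import "fmt"

// uniq2Fixed copies only up to l, so the amount of work
// shrinks as duplicates are removed, just like in uniq.
func uniq2Fixed(x []int) []int {
	count := 0
	i := 0
	l := len(x)
	for i < l-1 {
		if x[i] == x[i+1] {
			count += copy(x[i:], x[i+1:l])
			l--
		} else {
			i++
		}
	}
	fmt.Println("uniq2Fixed copied", count, "elements")
	return x[:l]
}

func main() {
	uniq2Fixed(make([]int, 1000)) // copies 499500 elements, same as uniq
}
```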