Why is slice faster than view() when constructing a multidimensional array from a vector?

Consider the following Vector:

numbers = Int32[1,2,3,4,5,6,7,8,9,10]

If I want to create a 2x5 matrix with the result:

1 2 3 4 5
6 7 8 9 10

I can't use reshape(numbers,2,5) or else I'll get:

1 3 5 7 9 
2 4 6 8 10

Using slice or view(), you can extract the top row and bottom row, convert them to a matrix row, and then use vcat().

I'm not saying that slice or view() is the only or best way to do this. Perhaps there is a faster way using reshape(); I just haven't figured it out.
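On the reshape() question raised above: Julia arrays are stored column by column, which is why reshape(numbers,2,5) fills columns first. One possible approach, sketched here as an assumption rather than something from the original post, is to reshape to 5x2 and then swap the dimensions. The names cols, matrix and lazy are illustrative only.

```julia
numbers = Int32[1,2,3,4,5,6,7,8,9,10]

# reshape fills column by column, so build a 5x2 matrix first:
# column 1 holds 1..5 and column 2 holds 6..10.
cols = reshape(numbers, 5, 2)

# Swapping the dimensions turns those columns into rows (this makes a copy).
matrix = permutedims(cols)
println(matrix)

# Lazy alternative: transpose wraps `cols` without copying, so it still
# shares memory with `numbers`.
lazy = transpose(cols)
println(lazy == matrix)
```

Whether this is actually faster than the vcat() version would need to be measured; it is shown only because it avoids building the two intermediate row matrices.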

numbers = Int32[1,2,3,4,5,6,7,8,9,10]

println("Using Slice:")
@time numbers_slice_matrix_top = permutedims(numbers[1:5])
@time numbers_slice_matrix_bottom = permutedims(numbers[6:10])
@time vcat(numbers_slice_matrix_top,numbers_slice_matrix_bottom)

println("Using view():")
@time numbers_view_matrix_top = permutedims(view(numbers,1:5))
@time numbers_view_matrix_bottom = permutedims(view(numbers,6:10))
@time vcat(numbers_view_matrix_top,numbers_view_matrix_bottom)

Output:

Using Slice:
  0.026763 seconds (5.48 k allocations: 329.155 KiB, 99.78% compilation time)
  0.000015 seconds (3 allocations: 208 bytes)
  0.301833 seconds (177.09 k allocations: 10.976 MiB, 93.30% compilation time)
Using view():
  0.103084 seconds (72.25 k allocations: 4.370 MiB, 99.90% compilation time)
  0.000011 seconds (2 allocations: 112 bytes)
  0.503787 seconds (246.63 k allocations: 14.537 MiB, 99.85% compilation time)

Why is slice faster? In a few rare cases view() was faster, but not by much.

From view() documentation:

For example, if x is an array and v = @view x[1:10], then v acts like a 10-element array, but its data is actually accessing the first 10 elements of x. Writing to a view, e.g. v[3] = 2, writes directly to the underlying array x (in this case modifying x[3]).
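The difference the documentation describes can be seen in a small sketch. The variable names here are illustrative and not taken from the post: a view writes through to the parent array, while a slice is an independent copy.

```julia
x = collect(1:10)

v = @view x[1:5]   # a view: no copy, refers to x's memory
s = x[1:5]         # a slice: allocates a new 5-element array

v[3] = 2           # writes through to x[3]
s[1] = -1          # changes only the copy, x[1] is untouched

println(x)
println(s)
```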

I don't know enough to be sure, but my understanding is that view() is slower because the matrix row has to be built from the original Vector through another array (the view). With a slice we create a copy and don't have to worry about manipulating the original Vector.

User response:

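Your results actually show that view() is faster, not slicing. Only the second measurement in each group times the code itself (0.000015 seconds for the slice against 0.000011 seconds for the view). In measurements 1 and 3 you are mostly timing compilation, as the reported compilation time percentages (93% to 99.9%) show.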

This is a common misunderstanding about how to run benchmarks in Julia. When a Julia function is run for the first time with given argument types, it has to be compiled to native machine code. In production code compile time normally does not matter, because you compile once, for a fraction of a second, and then run computations for many minutes, hours or days.
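The compile-once behaviour can be observed without any packages by timing the same call twice. This is only a rough sketch: rows_by_view is a hypothetical helper, and the actual timings depend on the machine and Julia version, so none are shown.

```julia
numbers = Int32[1,2,3,4,5,6,7,8,9,10]

# Hypothetical helper wrapping the view-based construction from the question.
rows_by_view(v) = vcat(permutedims(view(v, 1:5)), permutedims(view(v, 6:10)))

# The first call triggers compilation for Vector{Int32}, so its time
# includes compiling as well as running.
first_call = @elapsed rows_by_view(numbers)

# The second call reuses the compiled code and measures only the run.
second_call = @elapsed rows_by_view(numbers)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

A single @elapsed reading is still noisy, which is why a dedicated benchmarking tool that runs the code many times is the usual choice.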

On top of that, your code uses a non-constant global variable, so the microbenchmark also measures how long it takes to resolve the type of a global variable at run time. That is slow in Julia, and production code avoids it.
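Two common ways to avoid the global-variable cost are to pass the data into a function as an argument, or to declare the global as const. The sketch below illustrates both. The names build_matrix, build_from_const and NUMBERS are hypothetical and do not appear in the question.

```julia
# Non-constant global: its type could change at any time, so code that
# reads it directly cannot be specialized for Vector{Int32}.
numbers = Int32[1,2,3,4,5,6,7,8,9,10]

# Option 1: pass the data as an argument. Inside the function the concrete
# type of `v` is known, so the compiler can specialize on it.
function build_matrix(v::AbstractVector)
    top = permutedims(view(v, 1:5))
    bottom = permutedims(view(v, 6:10))
    return vcat(top, bottom)
end

# Option 2: declare the global `const`, which fixes its type, so a function
# that reads it directly can still be compiled efficiently.
const NUMBERS = Int32[1,2,3,4,5,6,7,8,9,10]
build_from_const() = build_matrix(NUMBERS)

from_argument = build_matrix(numbers)
from_const = build_from_const()
println(from_argument == from_const)
```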

Here is the correct way to run the benchmark using BenchmarkTools:

julia> using BenchmarkTools

julia> @btime vcat(permutedims($numbers[1:5]),permutedims($numbers[6:10]));
  202.326 ns (7 allocations: 448 bytes)

julia> @btime vcat(permutedims(view($numbers,1:5)),permutedims(view($numbers,6:10)));
  88.736 ns (1 allocation: 96 bytes)

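Note the interpolation symbol $: it splices the value of numbers into the benchmarked expression, so it is treated like a local variable with a known type rather than as an untyped global.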