Implementing convolution from scratch in Julia

I am trying to implement convolution by hand in Julia. I'm not too familiar with image processing or Julia, so maybe I'm biting off more than I can chew.

Anyway, when I apply this method with a 3*3 edge filter edge = [0 -1 0; -1 4 -1; 0 -1 0] as convolve(img, edge), I am getting an error saying that my values are exceeding the allowed values for the RGBA type.

Code

function convolve(img::Matrix{<:Any}, kernel)
    (half_kernel_w, half_kernel_h) = size(kernel) .÷ 2
    (width, height) = size(img)
    cpy_im = copy(img)
    for row ∈ 1+half_kernel_h:height-half_kernel_h
        for col ∈ 1+half_kernel_w:width-half_kernel_w
            from_row, to_row = row .+ (-half_kernel_h, half_kernel_h)
            from_col, to_col = col .+ (-half_kernel_h, half_kernel_h)
            cpy_im[row, col] = sum((kernel .* RGB.(img[from_row:to_row, from_col:to_col])))
        end
    end
    cpy_im
end

Error (original)

ArgumentError: element type FixedPointNumbers.N0f8 is an 8-bit type representing 256 values from 0.0 to 1.0, but the values (-0.0039215684f0, -0.007843137f0, -0.007843137f0, 1.0f0) do not lie within this range.
See the READMEs for FixedPointNumbers and ColorTypes for more information.

I can identify a simple case where such an error may occur (a white pixel surrounded by black pixels, or vice versa). I tried "fixing" this by following the advice from another Stack Overflow question, but I get more errors to the effect of "Math on colors is deliberately undefined in ColorTypes, but see the ColorVectorSpace package."

Code attempting to apply solution from the other SO question

function convolve(img::Matrix{<:Any}, kernel)
    (half_kernel_w, half_kernel_h) = size(kernel) .÷ 2
    (width, height) = size(img)
    cpy_im = copy(img)
    for row ∈ 1+half_kernel_h:height-half_kernel_h
        for col ∈ 1+half_kernel_w:width-half_kernel_w
            from_row, to_row = row .+ [-half_kernel_h, half_kernel_h]
            from_col, to_col = col .+ [-half_kernel_h, half_kernel_h]
            cpy_im[row, col] = sum((kernel .* RGB.(img[from_row:to_row, from_col:to_col] ./ 2 .+ 128)))
        end
    end
    cpy_im
end

Corresponding error

MethodError: no method matching +(::ColorTypes.RGBA{Float32}, ::Int64)
Math on colors is deliberately undefined in ColorTypes, but see the ColorVectorSpace package.

Closest candidates are:
  +(::Any, ::Any, !Matched::Any, !Matched::Any...) at operators.jl:591
  +(!Matched::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at int.jl:87
  +(!Matched::ChainRulesCore.AbstractThunk, ::Any) at ~/.julia/packages/ChainRulesCore/a4mIA/src/tangent_arithmetic.jl:122

Now, I can try using convert etc., but when I look at the big picture, I start to wonder what the idiomatic way of solving this problem in Julia is. And that is my question. If you had to implement convolution by hand from scratch, what would be a good way to do so?

EDIT: Here is an implementation that works, though it may not be idiomatic

function convolve(img::Matrix{<:Any}, kernel)
    (half_kernel_h, half_kernel_w) = size(kernel) .÷ 2
    (height, width) = size(img)
    cpy_im = copy(img)
    # println(Dict("width" => width, "height" => height, "half_kernel_w" => half_kernel_w, "half_kernel_h" => half_kernel_h, "row range" => 1+half_kernel_h:(height-half_kernel_h), "col range" => 1+half_kernel_w:(width-half_kernel_w)))
    for row ∈ 1+half_kernel_h:(height-half_kernel_h)
        for col ∈ 1+half_kernel_w:(width-half_kernel_w)
            from_row, to_row = row .+ (-half_kernel_h, half_kernel_h)
            from_col, to_col = col .+ (-half_kernel_w, half_kernel_w)
            vals = Dict()
            for method ∈ [red, green, blue, alpha]
                x = sum((kernel .* method.(img[from_row:to_row, from_col:to_col])))
                if x > 1
                    x = 1
                elseif x < 0
                    x = 0
                end
                vals[method] = x
            end
            cpy_im[row, col] = RGBA(vals[red], vals[green], vals[blue], vals[alpha])
        end
    end
    cpy_im
end

CodePudding user response：

First of all, the error

Math on colors is deliberately undefined in ColorTypes, but see the ColorVectorSpace package.

should direct you to read the docs of the ColorVectorSpace package, where you will learn that using ColorVectorSpace will now enable math on RGB types. (The absence of default support is deliberate, because the way the image-processing community treats RGB is colorimetrically wrong. But everyone has agreed not to care, hence the ColorVectorSpace package.)

Second,

ArgumentError: element type FixedPointNumbers.N0f8 is an 8-bit type representing 256 values from 0.0 to 1.0, but the values (-0.0039215684f0, -0.007843137f0, -0.007843137f0, 1.0f0) do not lie within this range.

indicates that you're trying to write negative entries with an element type, N0f8, that can't support such values. Instead of cpy_im = copy(img), consider something like cpy_im = [float(c) for c in img] which will guarantee a floating-point representation that can support negative values.
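
To see why the element type of the output buffer matters, here is a dependency-free analogy. UInt8 stands in for N0f8 purely for illustration (both are 8-bit storage types with no room for negative values); the variable names are made up for this sketch.

```julia
# UInt8 is a stand-in for N0f8 here (an assumption for illustration only).
img = UInt8[10 20; 30 40]

same_type = copy(img)                  # copy keeps the element type: Matrix{UInt8}
failed = try
    same_type[1, 1] = -5               # a negative value cannot be stored in UInt8
    false
catch err
    err isa InexactError               # Base reports the failed conversion
end

float_copy = [float(c) for c in img]   # element type becomes Float64
float_copy[1, 1] = -5                  # fine: floats can hold negative values
```

The same reasoning applies to the image case: copy(img) inherits the restrictive element type, while a float-valued buffer does not.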

Third, I would recommend avoiding steps like RGB.(img...) when nothing about your function otherwise addresses whether images are numeric, grayscale, or color. Fundamentally the only operations you need are scalar multiplication and addition, and it's better to write your algorithm generically leveraging only those two properties.
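
A minimal sketch of what "generic" could look like, assuming a plain numeric matrix and an odd-sized kernel. The name convolve_generic is hypothetical, and like the code in the question it does not flip the kernel (strictly speaking that makes it a correlation). Color element types would additionally need arithmetic to be defined for them, as discussed above.

```julia
# Sliding-window filter that relies only on `*` and `+` for the element type.
# `convolve_generic` is a hypothetical name, not a function from any package.
function convolve_generic(img::AbstractMatrix, kernel::AbstractMatrix)
    kh, kw = size(kernel) .÷ 2            # half sizes (odd-sized kernel assumed)
    height, width = size(img)             # Julia matrices are (rows, cols)
    # The output element type is whatever `kernel element * pixel` produces,
    # so negative results are representable when that product is a float.
    T = typeof(zero(eltype(kernel)) * zero(eltype(img)))
    out = zeros(T, height, width)         # border pixels are left at zero
    for col in 1+kw:width-kw, row in 1+kh:height-kh
        acc = zero(T)
        for j in -kw:kw, i in -kh:kh
            acc += kernel[i+kh+1, j+kw+1] * img[row+i, col+j]
        end
        out[row, col] = acc
    end
    return out
end

# Usage: a single bright pixel in the middle of a dark image.
img = zeros(5, 5)
img[3, 3] = 1.0
edge = [0 -1 0; -1 4 -1; 0 -1 0]
out = convolve_generic(img, edge)
```

Here out[3, 3] is 4.0 and its four direct neighbours are -1.0, which is exactly the kind of value a 0-to-1 storage type rejects. If the result has to go back into a displayable range, clamp.(out, 0, 1) does that in one step.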

CodePudding user response：

Tim Holy's answer above is correct - keep things simple and avoid relying on third-party packages when you don't need to.

I might point out that another option you may not have considered is to use a different algorithm. What you are implementing is the naive method, whereas many convolution routines use different algorithms for different sizes, such as im2col and Winograd (you can look these two up; I have a website that covers the idea behind both).

The im2col routine might be worth doing, as you can essentially break the routine into several pieces:

Unroll all 'regions' of the image to do a dot-product with the filter/kernel on, and stack them together into a single matrix.
Do a matrix-multiply with the unrolled input and filter/kernel.
Roll the output back into the correct shape.
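
The three steps above can be sketched as follows for a single-channel image. The function names im2col and convolve_im2col are hypothetical, the kernel is not flipped (matching the code in the question), and only "valid" positions are computed, so the output is smaller than the input.

```julia
# Step 1: unroll every kernel-sized patch into one column of a matrix.
function im2col(img::AbstractMatrix, kh::Integer, kw::Integer)
    height, width = size(img)
    out_h, out_w = height - kh + 1, width - kw + 1
    cols = Matrix{eltype(img)}(undef, kh * kw, out_h * out_w)
    idx = 1
    # `r` is the inner (fastest) loop, so patches are stored in column-major
    # order of their top-left corner, which is what `reshape` expects later.
    for c in 1:out_w, r in 1:out_h
        cols[:, idx] = vec(img[r:r+kh-1, c:c+kw-1])
        idx += 1
    end
    return cols, (out_h, out_w)
end

function convolve_im2col(img::AbstractMatrix, kernel::AbstractMatrix)
    cols, out_size = im2col(img, size(kernel)...)
    # Step 2: one matrix-vector product computes a dot product per patch.
    flat = transpose(cols) * vec(kernel)
    # Step 3: roll the flat result back into image shape.
    return reshape(flat, out_size)
end
```

For a 5×5 image and a 3×3 kernel this returns a 3×3 result; padding the input first would be one way to keep the original size.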

It might be more complicated overall, but each part is simpler, so you may find this easier to do. A matrix multiply routine is definitely quite easy to implement. For 1x1 (single-pixel) convolutions where the image and filter have the same ordering (i.e. NCHW images and FCHW filter) the first and last steps are trivial as essentially no rolling/unrolling is necessary.
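
As a rough illustration of the 1x1 case, here is a sketch where the whole convolution collapses to one matrix multiply. The layout is an assumption chosen to suit Julia's column-major arrays (image as (H, W, C), weights as (C, F) with one column per filter) rather than the NCHW/FCHW ordering mentioned above, and conv1x1 is a hypothetical name.

```julia
# 1x1 convolution as a single matrix multiply.
# Assumed layout (for illustration): img is (H, W, C), weights is (C, F).
function conv1x1(img::AbstractArray{<:Number,3}, weights::AbstractMatrix)
    h, w, c = size(img)
    size(weights, 1) == c || throw(DimensionMismatch("weights must have $c rows"))
    pixels = reshape(img, h * w, c)       # each row holds one pixel's channels
    # Each output channel is a weighted sum of the input channels per pixel.
    return reshape(pixels * weights, h, w, size(weights, 2))
end
```

With this layout the "unroll" and "roll" steps are just the two reshape calls, which is the sense in which they are trivial.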

A final word of advice: start simple and add the code to handle edge cases afterwards, because convolutions are definitely fiddly to work with.

Hope this helps!