I am trying to port the basic neural network application from Andrew Ng's course from Python to Julia, but I got stuck on this part.
I am using my own data set, so I am writing my own code to load and resize the images. To match the Python code (and to have all images as vectors inside one matrix), I need to convert them from RGB to a plain numeric Array so I can store them as columns of a matrix, but I keep getting an error and I can't find information about it anywhere.
I'm currently using an adapted version of the idea presented here.
using Images, FileIO, TestImages
cat_path = "path/Cat/"
cat_imgs = joinpath.(cat_path, readdir(cat_path))
function process_image(path_vec::Vector{String}, h::Int64, w::Int64, label::Int64)
    result = zeros((h*w), length(path_vec))
    class = []
    for i in enumerate(path_vec)
        img = load(i[2])::Array{RGB{N0f8},2}
        img = imresize(img,(h,w))::Array{RGB{N0f8},2}
        img = vec(img)::Vector{RGB{N0f8}}
        result[:,i[1]] = img # this is the line where I believe I'm getting the error
        push!(class, label)
    end
    return result, class
end
If I convert the images from RGB to Gray it works (which makes sense, since they then have a single channel and map directly onto a numeric array). But if I want to keep all channels, I can't just store the Vector{RGB{N0f8}} in the matrix, and if I try img = convert(Array{Float64,1},img)
I get the error: MethodError: Cannot `convert` an object of type RGB{N0f8} to an object of type Float64
I'm not sure how to make the code easily reproducible, but I believe that creating a folder with a single image and updating the file paths should be enough. Alternatively, run the individual lines inside the function on a test image:
using TestImages
img = testimage("mandrill")
CodePudding user response:
Just use channelview. Note that the RGB channels will be along the first dimension.
julia> channelview(testimage("mandrill"))
3×512×512 reinterpret(reshape, N0f8, ::Array{RGB{N0f8},2}) with eltype N0f8:
[:, :, 1] =
0.643 0.471 0.388 … 0.475 0.494 0.035
0.588 0.49 0.29 0.58 0.663 0.043
0.278 0.243 0.122 0.608 0.659 0.047
;;; …
[:, :, 512] =
0.702 0.471 0.376 … 0.318 0.314 0.016
0.737 0.541 0.314 0.314 0.247 0.02
0.463 0.29 0.192 0.235 0.278 0.008
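To get from there to one column of the result matrix, a minimal sketch could look like the following. The 32×32 target size and the variable names are only illustrative assumptions, not something taken from the question.

```julia
using Images, TestImages

img   = testimage("mandrill")        # 512×512 matrix of RGB{N0f8}
small = imresize(img, (32, 32))      # assumed target size, for illustration only
cv    = channelview(small)           # 3×32×32 view, color channel first
col   = vec(Float64.(cv))            # plain Vector{Float64} of length 3*32*32

length(col)  # 3072
```

Because the channel is the first dimension, this column stores the R, G and B values of each pixel next to each other. If all red values should come first, then all green, then all blue, vec(Float64.(permutedims(cv, (2, 3, 1)))) should give that ordering instead.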
CodePudding user response:
After Dan's suggestion I managed to find a solution, although probably a slow/inefficient one:
function process_image(path_vec::Vector{String}, h::Int64, w::Int64, label::Int64)
    result = zeros((h*w*3), length(path_vec))
    class = []
    for i in enumerate(path_vec)
        img = load(i[2])::Array{RGB{N0f8},2}
        img = imresize(img,(h,w))::Array{RGB{N0f8},2}
        img = vec(img)::Vector{RGB{N0f8}}
        # one column per color channel: (h*w)×3 matrix of N0f8
        img = [temp(img[k]) for k = 1:length(img), temp in [red, green, blue]]
        img = reshape(img, ((h*w*3),1))
        result[:,i[1]] = img
        push!(class, label)
    end
    return result, class
end
In case it isn't clear from the code: I extract the three color channels into the columns of a matrix, which gives a 1024x3 Array{N0f8,2} (when h*w is 1024). I then reshape that into a 3072x1 Array{N0f8,2}. Assigning the reshaped array to a column of the zeros matrix converts its values to Float64.
Not super happy that I had to manually input the number of channels to get the right dimension, but it works.
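One way to avoid typing in the channel count might be to let channelview determine it and build the matrix with hcat. This is only a sketch: images_to_matrix is a hypothetical name, and it takes already loaded images instead of file paths so it can be tried without any files on disk (random images stand in for real ones here).

```julia
using Images

# Hypothetical helper, not part of Images.jl: one flattened image per column.
function images_to_matrix(imgs, h::Int, w::Int, label::Int)
    cols = map(imgs) do img
        small  = imresize(RGB.(img), (h, w))                  # force RGB, then resize
        planes = permutedims(channelview(small), (2, 3, 1))   # h×w×3
        vec(Float64.(planes))                                 # reds, then greens, then blues
    end
    # reduce(hcat, ...) needs at least one image
    return reduce(hcat, cols), fill(label, length(imgs))
end

# Random stand-in images of different sizes
imgs = [rand(RGB{N0f8}, 64, 48), rand(RGB{N0f8}, 100, 80)]
X, y = images_to_matrix(imgs, 32, 32, 1)
size(X)  # (3072, 2)
```

The permutedims call keeps the same element order as the comprehension above (all red values, then green, then blue). With real files, passing load.(path_vec) as the first argument should work the same way.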