I'm working on a project to cluster images of defects. Each image is associated with a specific defect type and is read into a 3d array of pixels using readJPEG.
An example image: https://i.stack.imgur.com/pO9XY.jpg
library(jpeg)
im <- readJPEG("C:/Users/Rayane_2/Desktop/Data/PCB1/PCB/PCB_USED/01.jpg")
dim(im)
[1] 1586 3034 3
The desired process is as follows:
For each picture in a specific directory:
1/ Convert the JPG picture to a **3d array** (the RGB data of a JPG image is a 3d array).
2/ Summarize that 3d array in a **vector** of statistics using a function like `stats()`.
3/ Return this vector and continue, to build a full clustering dataset.
I want to convert im[,,1], im[,,2], and im[,,3] to vectors with as.vector().
After that I need to extract some statistics, something like:
stats <- function(im) {
  # Per channel: min, max, sum, range (min and max again), variance
  channel_stats <- function(ch) {
    x <- as.vector(im[, , ch])
    c(min(x), max(x), sum(x), range(x), var(x))
  }
  c(channel_stats(1), channel_stats(2), channel_stats(3))
}
These statistics can also be obtained with R packages, such as descr() in {summarytools} (see R statistics package).
Because the 3d array im is so large, this runs very slowly:
dim(im)
[1] 1586 3034 3
Question:
Are there other R functions or packages that can do this task much faster?
Thanks,
CodePudding user response:
We could loop over the third dimension with apply and MARGIN = 3:
out <- apply(im, 3, function(x) c(min = min(x), max = max(x), sum = sum(x)))
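To see the shape of the result, and to cover the variance from the question as well, here is a minimal sketch on a small synthetic array. The array `toy` and the helper `channel_stats` are made up for illustration and are not part of the original code. With MARGIN = 3, apply passes each channel as a matrix, so it is flattened with as.vector before calling var, which would otherwise return a covariance matrix of the columns.

```r
# Toy 2 x 2 x 3 array standing in for the output of readJPEG()
toy <- array(c(0.1, 0.2, 0.3, 0.4,
               0.5, 0.5, 0.5, 0.5,
               0.0, 1.0, 0.0, 1.0),
             dim = c(2, 2, 3))

# Hypothetical helper: summary statistics for one channel
channel_stats <- function(x) {
  x <- as.vector(x)  # apply() hands over a matrix; var() needs a plain vector
  c(min = min(x), max = max(x), sum = sum(x), var = var(x))
}

out <- apply(toy, 3, channel_stats)
dim(out)       # 4 statistics (rows) by 3 channels (columns)
out["var", ]   # per-channel variance
```

Each column of out holds the statistics for one channel, so as.vector(out) gives a single feature vector for the image.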
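If there are multiple files, loop over the file paths with lapply, which returns a list with one summary matrix per image (jpgfiles is a character vector of file paths):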
lst1 <- lapply(jpgfiles, function(file) apply(readJPEG(file), 3,
function(x) c(min = min(x), max = max(x), sum = sum(x))))
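To turn such a list into the clustering dataset from the question, each per-image matrix can be flattened into one row. This is a minimal sketch, assuming every image is RGB (three channels). The names `image_features`, `build_dataset`, and the `read_fun` argument are introduced here for illustration only, and `jpgfiles` is assumed to be a character vector of paths, for example from list.files().

```r
# Hypothetical helper: one named feature vector per image array
image_features <- function(im) {
  s <- apply(im, 3, function(x) c(min = min(x), max = max(x), sum = sum(x)))
  # Flatten column by column: all statistics of channel 1, then channel 2, ...
  setNames(as.vector(s),
           paste0(rownames(s), "_ch", rep(seq_len(ncol(s)), each = nrow(s))))
}

# Hypothetical wrapper: one row per file. read_fun defaults to jpeg::readJPEG
# and that default is only evaluated if no other reader is supplied.
build_dataset <- function(files, read_fun = jpeg::readJPEG) {
  rows <- lapply(files, function(f) image_features(read_fun(f)))
  out <- do.call(rbind, rows)
  rownames(out) <- basename(files)
  out
}

# Assumed usage; the directory path is a placeholder:
# jpgfiles <- list.files("path/to/images", pattern = "\\.jpe?g$",
#                        full.names = TRUE, ignore.case = TRUE)
# dataset <- build_dataset(jpgfiles)
```

The result is a numeric matrix with one row per image and one column per statistic and channel, which is the layout most clustering functions expect. Wrapping the call in system.time() is a simple way to check whether the speed is acceptable on the full-size images.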