What is 3 in numpy.resize(image,(IMG_HEIGHT,IMG

While trying to build a letter classifier in ML, I used the following code to create the image data and labels from the images in a folder using PIL.

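import os

import numpy as np
from PIL import Image

# Target size; 32 x 32 as described in the question text.
IMG_HEIGHT = 32
IMG_WIDTH = 32

def create_dataset_PIL(img_folder):
    img_data_array = []
    class_name = []
    # Each sub-folder name is used as the class label for the files inside it.
    for dir1 in os.listdir(img_folder):
        print(dir1)
        for file in os.listdir(os.path.join(img_folder, dir1)):
            image_path = os.path.join(img_folder, dir1, file)
            image = np.array(Image.open(image_path))
            image = np.resize(image, (IMG_HEIGHT, IMG_WIDTH, 3))
            image = image.astype('float32')
            image /= 255  # scale pixel values to the range [0, 1]
            img_data_array.append(image)
            class_name.append(dir1)
    return img_data_array, class_name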

Each image in the dataset is already 32 X 32 pixels, and I am resizing it to an array of dimension 32 X 32 X 3. But I don't understand: what is this 3rd dimension when all I need is 32 X 32 pixels?

I stumbled upon "Numpy Resize/Rescale Image", where I learned this may be an interpolation parameter. I also learned from YouTube that interpolation is required when resizing images. But I don't know what to do with this extra data. Should the input layer of my neural network now be 32 X 32 X 3 instead of just 32 X 32?

CodePudding user response：

The 3 represents the RGB (red, green, blue) values. Each pixel of the image is represented by 3 values instead of one. In a black-and-white image, each pixel would be represented by [pixel]; in an RGB image, each pixel would be represented by [pixel(R), pixel(G), pixel(B)].

In fact, each pixel of the image has 3 RGB values. These range between 0 and 255 and represent the intensity of red, green, and blue. A higher value stands for higher intensity and a lower value for lower intensity. For instance, one pixel can be represented as a list of these three values, [78, 136, 60]. Black would be represented as [0, 0, 0].

And yes: your input layer should match this 32 X 32 X 3 shape.
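To make the shapes concrete, here is a minimal sketch using only NumPy. The 2 x 2 arrays are made up for illustration and are not taken from the asker's dataset. It also shows that, as far as NumPy's documented behaviour goes, `np.resize` does not interpolate: it repeats the flattened input until the requested shape is filled, so the 3 in the call is simply the size of the last axis.

```python
import numpy as np

# Hypothetical 2 x 2 images, used only for illustration.
gray = np.array([[0, 255],
                 [128, 64]], dtype=np.uint8)   # one value per pixel
rgb = np.zeros((2, 2, 3), dtype=np.uint8)      # three values per pixel
rgb[0, 0] = [78, 136, 60]                      # the example pixel mentioned above
rgb[0, 1] = [255, 255, 255]                    # white: full intensity on every channel

print(gray.shape)  # two axes: height, width
print(rgb.shape)   # three axes: height, width, channels
print(rgb[0, 0])   # the three channel values of the top-left pixel

# np.resize flattens the input and repeats it to fill the new shape;
# no interpolation between neighbouring pixels takes place.
stretched = np.resize(gray, (2, 2, 3))
print(stretched.shape)
print(stretched.ravel())  # the four original values, repeated three times
```

If the source images are already 32 x 32 RGB, the `np.resize` call leaves them unchanged; if they are grayscale, it fills the extra axis with repeated data rather than with real colour information.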

CodePudding user response：

The 3rd dimension in a digital image contains information about the color present at the pixel at coordinate (x, y) in the image. It is also called the color channel.

Most common channel types:

RGB mode: the value is 3, for example image_shape: [32,32,3]
Grayscale mode: the value is 1, for example image_shape: [32,32,1]
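As a rough illustration of the two shapes listed above, the sketch below inspects the channel count of an array and adds an explicit channel axis to a 2-D grayscale image. The helper name `count_channels` is hypothetical and introduced only for this example.

```python
import numpy as np

def count_channels(image):
    """Return the number of colour channels implied by the array shape.

    Hypothetical helper: a 2-D array is treated as single-channel,
    otherwise the size of the last axis is taken as the channel count.
    """
    if image.ndim == 2:
        return 1
    return image.shape[-1]

rgb_image = np.zeros((32, 32, 3), dtype=np.float32)   # RGB mode
gray_image = np.zeros((32, 32), dtype=np.float32)     # grayscale, no channel axis

# Add a trailing axis so the grayscale image has shape (32, 32, 1).
gray_with_axis = gray_image[..., np.newaxis]

print(count_channels(rgb_image), rgb_image.shape)
print(count_channels(gray_with_axis), gray_with_axis.shape)
```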

If your ML model doesn't need the colour feature, you can use scikit-image to convert the image to grayscale with rgb2gray.
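To show what such a conversion does, here is a NumPy-only sketch that collapses the three channels into one with a weighted sum. The weights are an assumption: they are the luminance coefficients I understand scikit-image's `rgb2gray` to document, so treat this as an approximation of that function and not a replacement for it. The name `to_grayscale` is hypothetical.

```python
import numpy as np

# Assumed luminance weights for R, G, B (they sum to 1.0).
WEIGHTS = np.array([0.2125, 0.7154, 0.0721])

def to_grayscale(rgb):
    """Collapse the last (channel) axis of an RGB array into a single value per pixel."""
    return rgb[..., :3] @ WEIGHTS

# Random stand-in for a 32 x 32 RGB image with values in [0, 1).
rgb = np.random.default_rng(0).random((32, 32, 3))
gray = to_grayscale(rgb)

print(rgb.shape)   # height, width, 3 channels
print(gray.shape)  # height, width only
```

With a grayscale input like this, the network's input layer would be 32 X 32 (or 32 X 32 X 1 with an explicit channel axis) instead of 32 X 32 X 3.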

You can learn more about working with images in NumPy here.