While building a letter classifier, I used the following code to create the image data and labels from the images in a folder, using PIL.
import os
import numpy as np
from PIL import Image

IMG_HEIGHT = 32
IMG_WIDTH = 32

def create_dataset_PIL(img_folder):
    img_data_array = []
    class_name = []
    # Each subfolder name is used as the class label.
    for dir1 in os.listdir(img_folder):
        print(dir1)
        for file in os.listdir(os.path.join(img_folder, dir1)):
            image_path = os.path.join(img_folder, dir1, file)
            image = np.array(Image.open(image_path))
            image = np.resize(image, (IMG_HEIGHT, IMG_WIDTH, 3))
            image = image.astype('float32')
            image /= 255  # scale pixel values to [0, 1]
            img_data_array.append(image)
            class_name.append(dir1)
    return img_data_array, class_name
Each image in the dataset is already 32 X 32 pixels, and I am resizing it to an array of dimension 32 X 32 X 3.
But I don't understand, what is this 3rd dimension when all I need is 32 X 32 pixels?
I stumbled upon "Numpy Resize/Rescale Image", where I learned this may be an interpolation parameter. From YouTube I also learned that interpolation is required when resizing images. But I don't know what to do with this extra data. Should the input layer of my neural network now be 32 X 32 X 3 instead of just 32 X 32?
CodePudding user response:
The 3 represents the RGB (red, green, blue) values. Each pixel of the image is represented by three values instead of one. In a black-and-white image each pixel would be represented as [value]; in an RGB image each pixel is represented as [R, G, B].
For 8-bit images these values range from 0 to 255 and give the intensity of red, green, and blue. A lower value means lower intensity and a higher value means higher intensity. For instance, one pixel can be represented as the list [78, 136, 60]. Black would be represented as [0, 0, 0] and white as [255, 255, 255].
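To make the value range concrete, here is a small NumPy sketch. The 2 x 2 image and its pixel values are made up for illustration; only the layout (height, width, channels) matters.

```python
import numpy as np

# A hypothetical 2 x 2 RGB image: shape is (height, width, channels).
img = np.array(
    [
        [[0, 0, 0], [255, 255, 255]],   # black, white
        [[255, 0, 0], [78, 136, 60]],   # pure red, a greenish pixel
    ],
    dtype=np.uint8,
)

print(img.shape)   # (2, 2, 3)
print(img[1, 1])   # the three channel values of the bottom-right pixel

# Scaling to [0, 1], as the question's code does with `image /= 255`.
scaled = img.astype("float32") / 255
```

After scaling, 0 still means "no intensity" and 1.0 means "full intensity" for each channel.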
And yes: your input layer should match this 32 X 32 X 3 shape.
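As a rough sketch of what that shape means in practice, assume the list returned by create_dataset_PIL is stacked into one NumPy array. The random images below are stand-ins, not real data, and whether you keep the 3-D shape or flatten it depends on the kind of input layer you use.

```python
import numpy as np

IMG_HEIGHT, IMG_WIDTH, CHANNELS = 32, 32, 3

# Stand-in for the list returned by create_dataset_PIL: 5 random "images".
rng = np.random.default_rng(0)
images = [
    rng.random((IMG_HEIGHT, IMG_WIDTH, CHANNELS), dtype=np.float32)
    for _ in range(5)
]

# Stack the list into one array of shape (num_images, height, width, channels).
batch = np.stack(images)
print(batch.shape)

# A convolutional input layer would typically take (32, 32, 3) per image.
# A fully connected input layer needs each image flattened to 32 * 32 * 3 values.
flat = batch.reshape(len(batch), -1)
print(flat.shape)
```

So a dense input layer would see 32 * 32 * 3 = 3072 features per image rather than 32 * 32 = 1024.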
CodePudding user response:
The third dimension of a digital image holds the color information for the pixel at coordinate (x, y) in the image; it is also called the color channel.
Most common channel types:
- RGB mode: if the value is 3, for example image_shape: [32,32,3]
- Grayscale mode: if the value is 1, for example image_shape: [32,32,1]
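The two modes can be checked quickly with Pillow and NumPy. This is a minimal sketch using blank in-memory images created only for illustration; note that a grayscale image loaded this way usually has no channel axis at all, so the [32,32,1] shape has to be added explicitly.

```python
import numpy as np
from PIL import Image

# Two blank in-memory images, created only for illustration.
rgb = np.array(Image.new("RGB", (32, 32)))
gray = np.array(Image.new("L", (32, 32)))   # "L" is Pillow's 8-bit grayscale mode

print(rgb.shape)    # three channels per pixel
print(gray.shape)   # two dimensions only, no channel axis

# Add an explicit channel axis when a model expects (height, width, 1).
gray_with_channel = np.expand_dims(gray, axis=-1)
print(gray_with_channel.shape)
```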
If your ML model doesn't need the colour feature, you can use scikit-image to convert the images to grayscale with rgb2gray.
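To show what such a conversion does, here is a NumPy-only sketch that collapses the three channels into one with a weighted sum. The helper name to_grayscale is hypothetical, and the weights are the luminance weights I believe scikit-image documents for rgb2gray; treat them as an assumption and check the library docs before relying on them.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB array to (H, W) using a weighted channel sum."""
    # Assumed luminance weights for R, G, B (not taken from the original post).
    weights = np.array([0.2125, 0.7154, 0.0721])
    return rgb[..., :3] @ weights

rgb = np.zeros((32, 32, 3), dtype=np.float32)
rgb[..., 1] = 1.0  # a pure green test image

gray = to_grayscale(rgb)
print(gray.shape)  # the channel axis is gone

# With scikit-image installed, the library call would be:
# from skimage.color import rgb2gray
# gray = rgb2gray(rgb)
```

With grayscale input the network's input layer goes back to 32 X 32 (or 32 X 32 X 1 with an explicit channel axis).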
You can learn more about image usage in NumPy here.
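One caveat about the question's code is worth checking: np.resize does not interpolate and does not create color channels. It fills the requested shape with repeated copies of the flattened input. The sketch below uses a made-up 2 x 2 grayscale array to show the difference between that and copying each gray value into three channels.

```python
import numpy as np

gray = np.array([[1, 2],
                 [3, 4]])

# np.resize fills the new shape with repeated copies of the flattened input.
tiled = np.resize(gray, (2, 2, 3))
print(tiled[0, 0])    # values from three different pixels, not one pixel's channels

# Copying each gray value into three channels keeps every pixel in place.
stacked = np.stack([gray] * 3, axis=-1)
print(stacked[0, 0])
```

If the source images are already RGB and 32 X 32, the np.resize call changes nothing. If they are grayscale or a different size, converting with Pillow (for example Image.convert("RGB") and Image.resize) before calling np.array is probably the safer route.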