Add co-ordinates to each pixel in Python

I am trying to add coordinates to each pixel of an image. For this, I am doing the following:

import cv2
import numpy as np

img = cv2.imread('images/0001.jpg')
grayscale = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

np_grayscale = np.array(grayscale)

# make the array 3d
processed_image = np_grayscale[:, :, np.newaxis]

x = 0
y = 0
for pixel_line in reversed(processed_image):
    for pixel in pixel_line:
        pixel = np.append(pixel, [x, y])
        x += 1
    y += 1

print(processed_image)

But this does not seem to work: I still get the original array, which has the form

[[[255]
  [255]
  [255]
  ...
  [255]
  [255]
  [255]]

 ...
...
  ...
  [255]
  [255]
  [255]]]

Moreover, I don't think this is the most efficient way of doing this, because I read that append creates a new copy of the array. Can someone please help?
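The behavior described above can be reproduced on a tiny array. This is a minimal sketch, assuming a hypothetical 2x3 stand-in for the image: np.append returns a new array, and assigning that result to the loop variable only rebinds the name, so the original array is never modified.

```python
import numpy as np

# Hypothetical stand-in for the (H, W, 1) grayscale array from the question.
processed_image = np.full((2, 3, 1), 255, dtype=np.uint8)

for pixel_line in processed_image:
    for pixel in pixel_line:
        # np.append builds a NEW array; assigning it to `pixel` only
        # rebinds the loop variable and never touches processed_image.
        pixel = np.append(pixel, [0, 0])

# The original array still has a single value per pixel.
unchanged_shape = processed_image.shape  # (2, 3, 1)

# The appended result exists only as a separate array.
appended = np.append(processed_image[0, 0], [0, 0])
print(unchanged_shape, appended.tolist())
```

An array's shape is also fixed once it is created, so a third axis of length 1 cannot grow to length 3 in place; the result has to be built as a new array.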

CodePudding user response：

You can create a mesh of indices using meshgrid() and stack() them with the original image:

import numpy as np

x = np.asarray([
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
])

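# Pass the column range first: with the default 'xy' indexing, meshgrid
# returns arrays of shape (rows, cols), which matches x even when the
# image is not square.
indices = np.meshgrid(
    np.arange(x.shape[1]),
    np.arange(x.shape[0]),
    sparse=False
)

# After .T the array is indexed [column, row]; each entry is (value, column, row)
x = np.stack((x, *indices)).T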

# array([[[  0,   0,   0],
#         [  0,   0,   1],
#         [  0,   0,   2],
#         [  0,   0,   3]],

#        [[  0,   1,   0],
#         [  0,   1,   1],
#         [255,   1,   2],
#         [  0,   1,   3]],

#        [[  0,   2,   0],
#         [  0,   2,   1],
#         [  0,   2,   2],
#         [  0,   2,   3]],

#        [[  0,   3,   0],
#         [  0,   3,   1],
#         [  0,   3,   2],
#         [  0,   3,   3]]])

x[0, 0, :] # 0, 0, 0
x[1, 2, :] # 255, 1, 2
x[-1, -1, :] # 0, 3, 3
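If you would rather keep the image in its own (row, column) orientation, without the transpose, np.indices is one possible alternative. The following is only a sketch: the array name gray and its values are made up for illustration, and the result layout is (value, row, column).

```python
import numpy as np

# Hypothetical non-square grayscale image: 2 rows, 3 columns.
gray = np.array([[10, 20, 30],
                 [40, 50, 60]], dtype=np.uint8)

# np.indices returns an array of shape (2, H, W):
# first the row indices, then the column indices.
rows, cols = np.indices(gray.shape)

# Stack along a new last axis. The result takes the wider integer dtype
# of the index arrays, so coordinates above 255 do not overflow.
with_coords = np.dstack((gray, rows, cols))

print(with_coords.shape)            # (2, 3, 3)
print(with_coords[1, 2].tolist())   # [60, 1, 2]
```

Here with_coords[r, c] gives the pixel value at row r, column c, followed by r and c.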

CodePudding user response：

Since you have a grayscale image, you can convert it to a 3-channel array in which the first channel holds the pixel values and the other two channels hold the coordinates. If you later split the image into two or more parts, each part still carries the coordinates of the original image in those two channels, and to visualize the image you can simply use the first channel. Here's how you can do this:


# `grayscale` is the 2D array produced by cv2.cvtColor in the question
processed_image = np.zeros((grayscale.shape[0], grayscale.shape[1], 3), dtype=np.uint64)
processed_image[:, :, 0] = np.asarray(grayscale, dtype=np.uint64)

# Channel 1 stores the row index, channel 2 the column index
for i in range(processed_image.shape[0]):
    for j in range(processed_image.shape[1]):
        processed_image[i, j, 1] = i
        processed_image[i, j, 2] = j

print(processed_image)

# Displaying the image
cv2.imshow("img", np.array(processed_image[:, :, [0]], dtype=np.uint8))
cv2.waitKey(0)

Note: Take care with the datatypes of the NumPy arrays. You cannot store the coordinates of a complete image in an np.uint8 array, as it can only hold values from 0 to 255. That is why I have used an array of np.uint64 datatype to store the pixel values and coordinates. When displaying, you have to convert the first channel back to np.uint8, since that is the format cv2.imshow expects for integer pixel values in the 0-255 range.
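The two nested loops can become slow on large images. As a rough sketch, the same coordinate channels can be filled with broadcasting instead. The small grayscale array below is a made-up stand-in for the real image, and np.int32 is an assumption: any integer type wide enough for the largest row or column index should work.

```python
import numpy as np

# Hypothetical stand-in for the grayscale image loaded in the question.
grayscale = np.array([[10, 20, 30],
                      [40, 50, 60]], dtype=np.uint8)
h, w = grayscale.shape

# int32 is an assumed choice; it is wide enough for any realistic image size.
processed_image = np.zeros((h, w, 3), dtype=np.int32)
processed_image[:, :, 0] = grayscale
# A column vector of row indices is broadcast across all columns.
processed_image[:, :, 1] = np.arange(h)[:, np.newaxis]
# A row vector of column indices is broadcast down all rows.
processed_image[:, :, 2] = np.arange(w)[np.newaxis, :]

# Convert the first channel back to uint8, for example before displaying it.
display = processed_image[:, :, 0].astype(np.uint8)

# Why uint8 cannot hold coordinates: casting wraps around modulo 256.
wrapped = np.arange(254, 258).astype(np.uint8)
print(wrapped.tolist())  # [254, 255, 0, 1]
```

The last two lines illustrate the point of the note: an index of 256 stored as np.uint8 silently becomes 0.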