Background:
I'm playing around with Google's body segmentation API for Python (python-tf-bodypix). Since it's a library originally written for JS (tensorflow.js), I'm working from the colours of the coloured part mask: the torso should be green, so I know that:
torso = np.array([175, 240, 91])
The right part of the head should be purple, so it's [110, 64, 170], and so on...
My approach:
import numpy as np
import tensorflow as tf
from tf_bodypix.api import download_model, load_model, BodyPixModelPaths

# load the BodyPix model (as in the python-tf-bodypix README)
bodypix_model = load_model(download_model(BodyPixModelPaths.MOBILENET_FLOAT_50_STRIDE_16))

# get prediction result
img = "front_pic"
img_filename = img + ".png"
image = tf.keras.preprocessing.image.load_img(img_filename)
image_array = tf.keras.preprocessing.image.img_to_array(image)
result = bodypix_model.predict_single(image_array)

# simple mask
mask = result.get_mask(threshold=0.75)

# colored mask (separate colour for each body part)
colored_mask = result.get_colored_part_mask(mask)
tf.keras.preprocessing.image.save_img(img + '_cmask' + '.jpg',
                                      colored_mask)
# color codes
right_head = np.array([110, 64, 170])
left_head = np.array([143, 61, 178])
torso = np.array([175, 240, 91])
left_feet = np.array([84, 101, 214])
right_feet = np.array([99, 81, 195])
left_arm_shoulder = np.array([210, 62, 167])
right_arm_shoulder = np.array([255, 78, 125])
# counters and coordinate arrays, initialised before the loop
pixels_head = 0
torso_pixels = 0
feet_pixels = 0
left_arm_shoulder_pixels = 0
right_arm_shoulder_pixels = 0
head_coordinates = np.empty((0, 2), dtype=int)
torso_coordinates = np.empty((0, 2), dtype=int)
feet_coordinates = np.empty((0, 2), dtype=int)
left_arm_shoulder_coordinates = np.empty((0, 2), dtype=int)
right_arm_shoulder_coordinates = np.empty((0, 2), dtype=int)

# image dimensions and (x,y) coordinates
height, width = colored_mask.shape[:2]
coordinate_x = 0
coordinate_y = 0
for vertical_pixels in colored_mask:
    coordinate_y = coordinate_y + 1
    #if coordinate_y > height:
    #    coordinate_y = 0
    for pixels in vertical_pixels:
        coordinate_x = coordinate_x + 1
        if coordinate_x > width:
            coordinate_x = 1
        # Current Pixel
        np_pixels = np.array(pixels)
        current_coordinate = np.array([[coordinate_x, coordinate_y]])
        #print(current_coordinate)
        if np.array_equal(np_pixels, right_head) or np.array_equal(np_pixels, left_head):  # right head or left head
            pixels_head = pixels_head + 1
            head_coordinates = np.concatenate((head_coordinates, current_coordinate), axis=0)  # Save coordinates
        if np.array_equal(np_pixels, torso):  # Torso
            torso_pixels = torso_pixels + 1
            torso_coordinates = np.concatenate((torso_coordinates, current_coordinate), axis=0)  # Save coordinates
        if np.array_equal(np_pixels, left_feet) or np.array_equal(np_pixels, right_feet):  # feet
            feet_pixels = feet_pixels + 1
            feet_coordinates = np.concatenate((feet_coordinates, current_coordinate), axis=0)  # Save coordinates
        if np.array_equal(np_pixels, left_arm_shoulder):  # left_arm_shoulder
            left_arm_shoulder_pixels = left_arm_shoulder_pixels + 1
            left_arm_shoulder_coordinates = np.concatenate((left_arm_shoulder_coordinates, current_coordinate), axis=0)  # Save coordinates
        if np.array_equal(np_pixels, right_arm_shoulder):  # right_arm_shoulder
            right_arm_shoulder_pixels = right_arm_shoulder_pixels + 1
            right_arm_shoulder_coordinates = np.concatenate((right_arm_shoulder_coordinates, current_coordinate), axis=0)  # Save coordinates
The problem:
The problem with my approach is that it's super slow! For instance, this line of code:
if np.array_equal(np_pixels,torso): # Torso
takes a lot of execution time. Having to compare each pixel to its RGB equivalent one at a time is too slow.
My question:
What's the best solution? Either:
- Is there a better way within the python-tf-bodypix library/API to get the coordinates of the segmented body parts' pixels? (Does anyone know if such a method exists within the bodypix library?)
or...
- Is there a better/faster approach to comparing two NumPy arrays?
- Is there any other inefficient code in my approach that I should change?
CodePudding user response:
From the answer: Finding the (x,y) indexes of specific (R,G,B) color values from images stored in NumPy ndarrays
The solution to your problem would be:
coords = list(zip(*np.where(np.all(colored_mask == torso, axis=-1))))
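For completeness, here is a minimal sketch of how that one-liner could replace the whole loop, assuming the colour arrays and the colored_mask from the question (the part_colors dict and its keys are just illustrative names):

import numpy as np

# colours of each group of interest (values taken from the question)
part_colors = {
    "head": [np.array([110, 64, 170]), np.array([143, 61, 178])],    # right + left head
    "torso": [np.array([175, 240, 91])],
    "feet": [np.array([84, 101, 214]), np.array([99, 81, 195])],     # left + right feet
    "left_arm_shoulder": [np.array([210, 62, 167])],
    "right_arm_shoulder": [np.array([255, 78, 125])],
}

part_coordinates = {}
for name, colors in part_colors.items():
    # boolean mask: True where a pixel matches any of the group's colours
    match = np.zeros(colored_mask.shape[:2], dtype=bool)
    for color in colors:
        match |= np.all(colored_mask == color, axis=-1)
    ys, xs = np.where(match)                     # row (y) and column (x) indexes
    part_coordinates[name] = list(zip(xs, ys))   # (x, y) pairs; pixel count is len(...)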
CodePudding user response:
You can use the fact that each RGB triplet sums to a different value.
right_head = np.array([110, 64, 170]) # SUM = 344
left_head = np.array([143, 61, 178]) # SUM = 382
...
So you can sum the pixel values along the RGB dimension (the last axis of colored_mask):
x = np.sum(colored_mask, axis=-1)
And create a vector containing all the different possible sums, each corresponding to a body part:
val = np.array([344,382,506,399,375,439,458]) # [right_head_sum, left_head_sum...]
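As a side note, the same sums can also be computed directly from the colour arrays defined in the question instead of being hardcoded:

parts = np.stack([right_head, left_head, torso, left_feet, right_feet,
                  left_arm_shoulder, right_arm_shoulder])
val = parts.sum(axis=1)   # array([344, 382, 506, 399, 375, 439, 458])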
Then compare those values using broadcasting:
compare = x == val[:,None,None]
Count how many pixels you have in each of your 7 different categories:
count = np.sum(compare,axis=(1,2))
And you can use np.where and np.split to retrieve the coordinates:
coord3d = np.where(compare)
split = np.where(np.diff(coord3d[0]))[0] + 1
coord_x = np.split(coord3d[1],split)
coord_y = np.split(coord3d[2],split)
Example with a 5x5x3 image, where x holds the per-pixel sums:
x = array([[375, 399, 458, 382, 506],
[375, 382, 506, 458, 382],
[439, 344, 382, 344, 375],
[439, 439, 344, 382, 344],
[382, 399, 506, 399, 382]])
val = array([344, 382, 506, 399, 375, 439, 458])
count = array([4, 7, 3, 3, 3, 3, 2]) # [right_head, left_head, ...]
coord_x = [array([2, 2, 3, 3]),          # right_head
           array([0, 1, 1, 2, 3, 4, 4]), # left_head
           array([0, 1, 4]),             # ...
           array([0, 4, 4]),
           array([0, 1, 2]),
           array([2, 3, 3]),
           array([0, 1])]
coord_y = [array([1, 3, 2, 4]),          # right_head
           array([3, 1, 4, 2, 3, 0, 4]), # left_head
           array([4, 2, 2]),             # ...
           array([1, 1, 3]),
           array([0, 0, 4]),
           array([0, 0, 1]),
           array([2, 3])]
It should be way faster than your for loop.
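Putting the steps together, here is a minimal runnable sketch of the whole pipeline; the names list and the dummy 5x5 colored_mask are for illustration only, in practice you would pass the colored_mask from the question:

import numpy as np

# order must match val: [right_head, left_head, torso, left_feet, right_feet, left_arm_shoulder, right_arm_shoulder]
names = ["right_head", "left_head", "torso", "left_feet", "right_feet",
         "left_arm_shoulder", "right_arm_shoulder"]
part_colors = np.array([[110, 64, 170], [143, 61, 178], [175, 240, 91],
                        [84, 101, 214], [99, 81, 195], [210, 62, 167], [255, 78, 125]])
val = part_colors.sum(axis=1)              # array([344, 382, 506, 399, 375, 439, 458])

# dummy (5, 5, 3) coloured mask built from those colours, just for demonstration
colored_mask = part_colors[np.arange(25).reshape(5, 5) % 7]

x = np.sum(colored_mask, axis=-1)          # (H, W) per-pixel RGB sums
compare = x == val[:, None, None]          # (7, H, W) booleans, one slice per body part
count = np.sum(compare, axis=(1, 2))       # pixel count per body part

coord3d = np.where(compare)                # (part index, row, column) of every matching pixel
split = np.where(np.diff(coord3d[0]))[0] + 1
rows = np.split(coord3d[1], split)         # note: assumes every part has at least one pixel
cols = np.split(coord3d[2], split)

for name, n, r, c in zip(names, count, rows, cols):
    print(name, int(n), list(zip(c, r)))   # (column, row) pairs for each body part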