Convert a pandas DataFrame of hex colors to a 3D NumPy array for cv2

I have a file that has X lines of Y hex codes, like this:

FFFFFF 123456 453623 ....
354352 5AB12A 123789 ....
...... ...... ...... ....

My final goal is to turn it into an OpenCV image in Python, i.e. a NumPy array. I am reading the file with pandas read_table, as it is faster than reading it with plain Python file I/O. So now I have a pandas DataFrame of hex strings, and I apply some transformations to it:

data = pd.read_table('ИРС/sample/0', sep=' ', header=None, dtype=str)
data = data.iloc[:,:-1]
data = data.applymap(lambda x: int(x, 16))
data = data.applymap(lambda x: np.array([x>>16, x>>8, x]).astype(np.uint8))

Now, when I convert it into a NumPy array via data.to_numpy() and run cv2.cvtColor(img, cv2.COLOR_BGR2HSV), I get the following error:

>  - src data type = 17 is not supported

This suggests that I am using signed ints in my array, but that is not the case.

My question is: how do I convert it, and am I doing the pandas-to-cv2 conversion right?

CodePudding user response：

I suggest converting the DataFrame to a NumPy array right after converting the data to integers, and doing the rest of the conversion in NumPy.
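As a side note on the error in the question, here is a minimal sketch that assumes a small hand-made frame in place of the real file. Putting one small array into each cell leaves pandas with object columns, so to_numpy() returns a 2D array of dtype object rather than a 3D uint8 array. 17 is NumPy's internal type number for object, which appears to be what the message reports. The names `df`, `cells` and `arr` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x2 frame of already-parsed integers (stands in for the file).
df = pd.DataFrame([[0xFFFFFF, 0x123456], [0x354352, 0x5AB12A]])

# One small uint8 array per cell, as in the question.
cells = df.apply(lambda col: col.map(
    lambda x: np.array([x >> 16, x >> 8, x]).astype(np.uint8)))

arr = cells.to_numpy()
print(arr.dtype, arr.shape)  # object (2, 2)

# Each element is itself an array, so there is no real third axis yet.
print(type(arr[0, 0]).__name__)  # ndarray
```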

You may use the following stages:

Convert from Hex to int.

 data = data.applymap(lambda x: int(x, 16))
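A small sketch of this step, assuming an in-memory sample (`sample_text`, hypothetical) in place of the real file. Since DataFrame.applymap is deprecated in recent pandas releases (2.1 and later) in favor of DataFrame.map, the sketch maps each column instead, which should behave the same on older and newer versions.

```python
import io

import pandas as pd

# Hypothetical two-line sample in the same layout as the file.
sample_text = "FFFFFF 123456 453623\n354352 5AB12A 123789\n"

data = pd.read_table(io.StringIO(sample_text), sep=' ', header=None, dtype=str)

# Parse every cell as a base-16 integer, column by column.
data = data.apply(lambda col: col.map(lambda x: int(x, 16)))
print(data)
```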

Convert to NumPy array of type np.uint32 and make it contiguous (because the next operation ".view(np.uint8)" requires contiguous data).
```
 data = np.ascontiguousarray(data.to_numpy(np.uint32))
```
Use .view(np.uint8): each uint32 element is viewed as 4 uint8 elements.
The result has an RGBA-like 4-channel layout: the 4th element of each pixel is always 0 and may be treated as an alpha channel, which should be removed.
```
 data = data.view(np.uint8).reshape((data.shape[0], data.shape[1], 4))
```
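To make the byte layout concrete, here is a small check on a single value. It pins the dtype to little-endian ('<u4') so the result does not depend on the machine; on typical x86 and ARM systems np.uint32 is already little-endian. The names `px` and `bytes_view` are illustrative.

```python
import numpy as np

# One "pixel" holding the hex code 123456, stored as little-endian uint32.
px = np.array([[0x123456]], dtype='<u4')

# Reinterpret the same memory as bytes: 4 bytes per original element.
bytes_view = px.view(np.uint8).reshape(px.shape[0], px.shape[1], 4)
print(bytes_view.tolist())  # [[[86, 52, 18, 0]]]
```

The low byte (0x56 = 86) comes first and the padding zero comes last. Reversing the first three bytes and dropping the fourth gives [18, 52, 86], the same order as [x>>16, x>>8, x] in the question. Whether that is the intended channel order for OpenCV, which treats the first channel as blue, depends on how the hex codes are meant to be read.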
Convert from RGBA to BGR using OpenCV (assuming the input represents RGB).
```
 img = cv2.cvtColor(data, cv2.COLOR_RGBA2BGR)
```

Complete code sample:

import pandas as pd
import numpy as np
import cv2

data = pd.read_table('0.txt', sep=' ', header=None, dtype=str)

# data:
#         0       1       2
# 0  FFFFFF  123456  453623
# 1  354352  5AB12A  123789

#data = data.iloc[:,:-1]

data = data.applymap(lambda x: int(x, 16)) # Convert from hex to int
# data:
#           0        1        2
# 0  16777215  1193046  4535843
# 1   3490642  5943594  1193865

# data = data.applymap(lambda x: np.array([x>>16, x>>8, x]).astype(np.uint8))

# Convert to NumPy array of type np.uint32, make it contiguous, because the next operation ".view(np.uint8)" requires contiguous data.
data = np.ascontiguousarray(data.to_numpy(np.uint32))
# data:
# array([[16777215,  1193046,  4535843],
#        [ 3490642,  5943594,  1193865]], dtype=uint32)

# Use .view(np.uint8): each uint32 element is viewed as 4 uint8 elements.
# The result has an RGBA-like layout (the 4th element, always 0, may be considered alpha).
data = data.view(np.uint8).reshape((data.shape[0], data.shape[1], 4))
# data:
# array([[[255, 255, 255,   0], [ 86,  52,  18,   0], [ 35,  54,  69,   0]],
#        [[ 82,  67,  53,   0], [ 42, 177,  90,   0], [137,  55,  18,   0]]], dtype=uint8)

# Convert from RGBA to BGR
img = cv2.cvtColor(data, cv2.COLOR_RGBA2BGR)
# img:
# array([[[255, 255, 255], [ 18,  52,  86], [ 69,  54,  35]],
#       [[ 53,  67,  82], [ 90, 177,  42], [ 18,  55, 137]]], dtype=uint8)

# Show image for testing
#cv2.imshow('img', img)
#cv2.waitKey()
#cv2.destroyAllWindows()

Note:

The trick of using data.view(np.uint8).reshape may be unintuitive.
It's also fine to convert each uint32 element to three uint8 elements using shift operations.
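
A minimal sketch of that alternative, assuming the integers are already in a uint32 array (the sample values are typed in by hand here). It works on the whole array at once and produces the same channel order as the question's [x>>16, x>>8, x], without creating one Python object per cell.

```python
import numpy as np

# Sample values from the two example rows, already parsed from hex.
data = np.array([[0xFFFFFF, 0x123456, 0x453623],
                 [0x354352, 0x5AB12A, 0x123789]], dtype=np.uint32)

# Extract the high, middle and low byte of every element and stack them
# along a new last axis, giving shape (rows, cols, 3).
img = np.stack([(data >> 16) & 0xFF,
                (data >> 8) & 0xFF,
                data & 0xFF], axis=-1).astype(np.uint8)

print(img.shape, img.dtype)  # (2, 3, 3) uint8
```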