I am trying to convert an image from RBG to HED using the rgb2hed
function from skimage. My image is very big and if I just try and put the whole thing into the rgb2hed
function then I run out of memory. To get around this I have written some code to split the image into sections and apply rgb2hed
to each section, but I get very different results when doing so. Getting a matrix of almost all 0's when splitting up the function. Minimal reprex given below.
import numpy as np
from skimage.color import rgb2hed
np.random.seed(42)
sample = np.random.randint(0, 255, size=(100, 100, 3))
n_splits = 10
x_split_inds = np.linspace(0, sample.shape[0], n_splits 1, dtype=int)
for x in range(len(x_split_inds) - 1):
x_start = x_split_inds[x]
x_end = x_split_inds[x 1]
sample[x_start:x_end, :, :] = rgb2hed(sample[x_start:x_end, :, :])
print(rgb2hed(sample))
print()
print(sample)
This is the first matrix I get
[[[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
...
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]]
[[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
...
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]]
[[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
...
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]]
...
[[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
...
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]]
[[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
...
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]]
[[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
...
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]
[1.21016731 0. 0.88195046]]]
and this is the second
[[[1 0 0]
[1 0 0]
[1 0 0]
...
[1 0 0]
[1 0 0]
[1 0 0]]
[[1 0 0]
[1 0 0]
[1 0 0]
...
[1 0 0]
[1 0 0]
[1 0 0]]
[[1 0 0]
[1 0 0]
[1 0 0]
...
[1 0 0]
[1 0 0]
[1 0 0]]
...
[[1 0 0]
[1 0 0]
[1 0 0]
...
[1 0 0]
[1 0 0]
[1 0 0]]
[[1 0 0]
[1 0 0]
[1 0 0]
...
[1 0 0]
[1 0 0]
[1 0 0]]
[[1 0 0]
[1 0 0]
[1 0 0]
...
[1 0 0]
[1 0 0]
[1 0 0]]]
CodePudding user response:
The issue is related to dtype
mismatching:
- The
dtype
ofsample = np.random.randint(...)
is'int32'
- The
dtype
ofrgb2hed(...)
isfloat64
When updating a slice of NumPy array, the data is converted to the type of that updated NumPy array.
The expression sample[x_start:x_end, :, :] = rgb2hed(...)
automatically converts from float64
to 'int32'
type.
The cast from float64
to 'int32'
converts 1.21016731
to 1
and 0.88195046
to 0
...
We may solve it by avoiding "in place" processing:
output_sample = np.zeros((100, 100, 3), np.float64) # Allocate output_sample
...
output_sample[x_start:x_end, :, :] = rgb2hed(sample[x_start:x_end, :, :])
Does it require too much RAM?
We may also solve it by converting type of sample
to np.float64
:
import numpy as np
from skimage.color import rgb2hed
np.random.seed(42)
sample = np.random.randint(0, 255, size=(4, 4, 3)).astype(np.uint8)
sample_copy = sample.copy()
sample = sample.astype(np.float64) / 255.0 # Divide by 255, because rgb2hed expects range [0, 1] for float64 type
n_splits = 4 #10
x_split_inds = np.linspace(0, sample.shape[0], n_splits 1, dtype=int)
for x in range(len(x_split_inds) - 1):
x_start = x_split_inds[x]
x_end = x_split_inds[x 1]
sample[x_start:x_end, :, :] = rgb2hed(sample[x_start:x_end, :, :])
print(rgb2hed(sample_copy))
print()
print(sample)
Output of print(rgb2hed(sample_copy))
:
[[[-0.46103368 0.07251299 -0.31709355]
[-0.34690713 0.05235859 -0.33691324]
[-0.57147719 0.26278779 -0.31203784]
[-0.43647159 0.10270586 -0.43322983]]
...
Output of print(sample))
:
[[[-0.46103368 0.07251299 -0.31709355]
[-0.34690713 0.05235859 -0.33691324]
[-0.57147719 0.26278779 -0.31203784]
[-0.43647159 0.10270586 -0.43322983]]
Saving memory:
Each float64
elements is 8 bytes in RAM.
For saving memory space, we may use float32
type that is only 4 bytes in RAM.
Example: sample = np.random.randint(0, 255, size=(4, 4, 3)).astype(np.float32) / 255.0
...
For saving more RAM in expense of accuracy, we may use int16
type with scaling and rounding.
Example:
import numpy as np
from skimage.color import rgb2hed
np.random.seed(42)
sample = np.random.randint(0, 255, size=(4, 4, 3)).astype(np.uint8)
sample_copy = sample.copy()
sample = sample.astype(np.int16) # Convert to int16 (used for "in place" processing).
n_splits = 4 #10
x_split_inds = np.linspace(0, sample.shape[0], n_splits 1, dtype=int)
for x in range(len(x_split_inds) - 1):
x_start = x_split_inds[x]
x_end = x_split_inds[x 1]
# Scale by 10000 and convert to int16 with rounding and clipping:
sample[x_start:x_end, :, :] = np.round(rgb2hed(sample[x_start:x_end, :, :].astype(np.uint8))*10000).clip(-32768, 32767).astype(np.int16)
print(rgb2hed(sample_copy))
print()
print(sample*1e-4) # Divide by 10000
Output of print(sample*1e-4)
:
[[[-0.461 0.0725 -0.3171]
[-0.3469 0.0524 -0.3369]
[-0.5715 0.2628 -0.312 ]
[-0.4365 0.1027 -0.4332]]
As you can see, the accuracy is reduced to 4 decimal digits.