I have a dataset of tf.RaggedTensors containing strings that represent hexadecimal numbers, like this:
[
[[b'F6EE', b'BFED', b'4EEA', b'00EE', b'77AE', b'1FBE', b'1A6E',
b'5AEB', b'6A0E', b'212F'],
...
[b'FFEE', b'FFED', b'FEED', b'FDEE', b'FAAE', b'FFBE', b'FA8E',
b'FAEB', b'FA0E', b'E12F']],
...
[[b'FFEE', b'FFED', b'FEED', b'FDEE', b'FAAE', b'FFBE', b'FA8E',
b'FAEB', b'FA0E', b'E12F'],
...
[b'B6EE', b'BFED', b'4EEA', b'00EE', b'77AE', b'1FBE', b'1A6E',
b'5AEB', b'6A0E', b'212F']]
]
I want to convert it into a Tensor of int values, but tf.strings.to_number(tensor, tf.int32) has no option to specify the base (base 16 here). Are there any alternatives?
The dataset contains tf.RaggedTensors, but the target shape is (batch_size, 100, 10). I guess this could be helpful if we were to write a custom function for this.
CodePudding user response:
I think you're looking for something like this.
I first create an example 3D tensor, like the one you have.
import tensorflow as tf

a = tf.convert_to_tensor(['F6EE', 'BFED', '4EEA', '00EE', '77AE', '1FBE', '1A6E',
                          '5AEB', '6A0E', '212F'])
b = tf.convert_to_tensor(['FFEE', 'FFED', 'FEED', 'FDEE', 'FAAE', 'FFBE', 'FA8E',
                          'FAEB', 'FA0E', 'E12F'])
tensor = tf.ragged.stack([[a, b]]).to_tensor()
print(tensor)
tf.Tensor(
[[[b'F6EE' b'BFED' b'4EEA' b'00EE' b'77AE' b'1FBE' b'1A6E' b'5AEB'
b'6A0E' b'212F']
[b'FFEE' b'FFED' b'FEED' b'FDEE' b'FAAE' b'FFBE' b'FA8E' b'FAEB'
b'FA0E' b'E12F']]], shape=(1, 2, 10), dtype=string)
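Then, based on this answer, I wrote a custom function and mapped it over every element of the tensor to apply the transformation, in this case parsing each string as a base-16 integer.
def my_cast(t):
    # Pull the raw bytes out of the scalar string tensor (works in eager mode)
    val = tf.keras.backend.get_value(t)
    # Parse the bytes as a base-16 integer
    return int(val, 16)

shape = tf.shape(tensor)
# Flatten so that map_fn visits one string at a time
elems = tf.reshape(tensor, [-1])
res = tf.map_fn(fn=my_cast, elems=elems, fn_output_signature=tf.int32)
# Restore the original shape
res = tf.reshape(res, shape)
print(res)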
The output is the tensor:
tf.Tensor(
[[[63214 49133 20202 238 30638 8126 6766 23275 27150 8495]
[65518 65517 65261 65006 64174 65470 64142 64235 64014 57647]]],
shape=(1, 2, 10),
dtype=int32
)
Adding fn_output_signature=tf.int32 to tf.map_fn is important because it lets the result have a different dtype from the input tensor (int32 here, rather than string).
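Since the data comes from a dataset, it may also help to have a version that uses only TensorFlow ops, because pulling Python values out of a tensor will most likely not work inside dataset.map or tf.function, where tensors are symbolic. The following is only a sketch under some assumptions: hex_to_int32 and its width parameter are names I made up for illustration, every string is assumed to be exactly width hex digits long (4 in the example data), and the input is assumed to be a dense tensor (call .to_tensor() on a RaggedTensor first, provided no padding strings are introduced).

```python
import tensorflow as tf

def hex_to_int32(strings, width=4):
    """Parse fixed-width hex strings into int32 using only TensorFlow ops.

    Hypothetical helper: assumes every element has exactly `width` hex digits.
    """
    # Normalise to upper case so 'a'..'f' and 'A'..'F' are handled the same way
    strings = tf.strings.upper(strings)
    # One ASCII code per character; adds a trailing axis of size `width`
    codes = tf.cast(tf.io.decode_raw(strings, tf.uint8), tf.int32)
    # '0'..'9' are codes 48..57, 'A'..'F' are codes 65..70
    digits = tf.where(codes >= 65, codes - 55, codes - 48)
    # Positional weights, most significant digit first, e.g. [4096, 256, 16, 1]
    weights = tf.constant([16 ** i for i in reversed(range(width))], dtype=tf.int32)
    # Weighted sum over the character axis gives the integer value
    return tf.reduce_sum(digits * weights, axis=-1)

example = tf.constant([[b'F6EE', b'00EE'], [b'FFEE', b'212F']])
print(hex_to_int32(example))
```

The output keeps the shape of the input, so no flatten and reshape step is needed, and the whole tensor is processed at once instead of element by element.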