Tensor split to dynamic length tensors based on continuous mask values in TensorFlow?

I'm trying to figure out how to split a tensor of sequential data into variable-length parts, with the boundaries given by the contiguous runs of 1s in a binary mask.

I've read the official documentation. However, I can't find any function that handles this easily. Is there a good way to do this in Python?

I have tried 'tf.ragged.boolean_mask', but it doesn't seem to fit my case.

Here is an example of what I mean:

inputs:

# both are tensors, NOT data.
data_tensor = ([3,5,6,2,6,1,3,9,5])
mask_tensor = ([0,1,1,1,0,0,1,1,0])

expected output:

output_tensor = ([[3],[5,6,2],[6,1],[3,9],[5]])

Thank you.

CodePudding user response：

I recently discovered a method to do it in a very clean way in this answer by @AloneTogether:

import tensorflow as tf

data_tensor = tf.constant([3,5,6,2,6,1,3,9,5])
mask_tensor = tf.constant([0,1,1,1,0,0,1,1,0])

# Index where the mask changes.
change_idx = tf.concat([tf.where(mask_tensor[:-1] != mask_tensor[1:])[:, 0], [tf.shape(mask_tensor)[0]-1]], axis=0)

# Ranges of indices to gather.
ragged_idx = tf.ragged.range(tf.concat([[0], change_idx[:-1] + 1], axis=0), change_idx + 1)

# Gather ranges into ragged tensor.
output_tensor = tf.gather(data_tensor, ragged_idx)

print(output_tensor)

<tf.RaggedTensor [[3], [5, 6, 2], [6, 1], [3, 9], [5]]>
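
To see why the index arithmetic above produces these groups, the same three steps can be traced with plain Python lists. This is only an illustrative sketch for small inputs, not a replacement for the TensorFlow version; the names starts, limits and runs are introduced here for clarity and do not appear in the answer's code.

```python
data = [3, 5, 6, 2, 6, 1, 3, 9, 5]
mask = [0, 1, 1, 1, 0, 0, 1, 1, 0]

# Positions i where mask[i] differs from mask[i + 1]: the last index of every
# run except the final one. The final run always ends at len(mask) - 1.
change_idx = [i for i in range(len(mask) - 1) if mask[i] != mask[i + 1]]
change_idx.append(len(mask) - 1)

# Each run starts right after the previous run's last index; the first starts at 0.
starts = [0] + [i + 1 for i in change_idx[:-1]]

# Exclusive end of each run, playing the role of the second argument
# passed to tf.ragged.range in the answer.
limits = [i + 1 for i in change_idx]

# Slicing with (start, limit) pairs is the list equivalent of gathering ranges.
runs = [data[s:e] for s, e in zip(starts, limits)]

print(change_idx)  # [0, 3, 5, 7, 8]
print(runs)        # [[3], [5, 6, 2], [6, 1], [3, 9], [5]]
```

Note that the split happens wherever the mask value changes, so runs of 0s are grouped together just like runs of 1s, which is what the expected output in the question shows.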