Using tf.where (or np.where) to draw randomly conditional on an input

I have a TensorFlow vector that contains only 1s and 0s, such as a = [0, 0, 0, 1, 0, 1]. Conditional on each element of a, I want to draw a new random value of 0 or 1: where the element is 1 I want to redraw it, and where it is 0 I want to leave it alone. So I've tried this:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

# random draw of zeros and ones
a = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)

which gives me <tf.Tensor: shape=(6,), dtype=float32, numpy=array([0., 0., 0., 1., 0., 1.], dtype=float32)> then if I redraw

# redraw with a different probability if value is 1. in the original draw
b = tf.where(a == 1.0, tfd.Binomial(total_count = 1., probs = 0.5).sample(1), a)

I would expect tf.where to give me a new vector b in which, on average, half of the 1s have become 0s. Instead it returns either a copy of a or a vector of all 0s. The output I am after would be one of b = [0, 0, 0, 0, 0, 0], b = [0, 0, 0, 0, 0, 1], b = [0, 0, 0, 1, 0, 0], or b = [0, 0, 0, 1, 0, 1]. I could of course just use b = tfd.Binomial(total_count = 1.0, probs = 0.25).sample(6), but in my particular case the order of the original vector matters.

A more general situation might use a different distribution, so that bitwise operations can't easily be used. For example:

# random draw of normals
a = tfd.Normal(loc = 0., scale = 1.).sample(6)
# redraw with a different probability if value is greater than zero in the original draw
b = tf.where(a > 0, tfd.Normal(loc = 0., scale = 1.).sample(1), a)

CodePudding user response：

APPROACH 1:

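Not tested, but I think the middle parameter should be a tensor with the same shape as the original one (6 elements here). With sample(1), tf.where broadcasts that single draw to every position, so all the 1s are replaced by the same value. That is why you get either a copy of a or all 0s.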

First, make a second random sequence, of same length:

a2 = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)

Then:

b = tf.where(a == 1.0, a2, 0.0)

Explanation:

The values in a2 are irrelevant where a is 0, and are 50-50 on average where a is 1.
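
The difference between one shared draw and one draw per element can be seen in a small NumPy sketch (np.where and tf.where follow the same broadcasting rules). The names rng, single and per_element are illustrative and not from the original post, and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

a = np.array([0., 0., 0., 1., 0., 1.])

# One draw of shape (1,): np.where broadcasts it, so every 1 in a
# is replaced by the SAME value (all kept or all zeroed).
single = rng.integers(0, 2, size=1).astype(float)
b_shared = np.where(a == 1.0, single, a)

# One draw per element: each 1 in a gets its own independent redraw.
per_element = rng.integers(0, 2, size=a.shape).astype(float)
b_independent = np.where(a == 1.0, per_element, a)

print(b_shared)
print(b_independent)
```

Passing a as the last argument instead of 0.0 gives the same result for 0/1 data and also carries over to inputs that are not binary.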

APPROACH 2:

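If that doesn't work, then the first parameter needs to be a boolean tensor of [True, False, ...]. An elementwise comparison produces one directly, so no Python-level map is needed:

# elementwise comparison yields a boolean tensor with the same shape as a
cond = tf.greater(a, 0.0)

b = tf.where(cond, a2, 0.0)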
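
The same mask-based pattern covers the more general case from the question, where only the positive entries of a normal sample are redrawn. The following is a NumPy sketch under that assumption. The names rng, mask and redraw are hypothetical and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility

a = rng.normal(loc=0.0, scale=1.0, size=6)
mask = a > 0  # elementwise comparison already returns a boolean array

# Draw a fresh normal value for every position; only the entries
# selected by the mask end up in the result.
redraw = rng.normal(loc=0.0, scale=1.0, size=a.shape)
b = np.where(mask, redraw, a)  # keep the original value where mask is False

print(a)
print(b)
```

Here the fallback argument has to be a rather than 0.0, because the untouched entries are no longer guaranteed to be zero.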

APPROACH 3:

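Tested with integer 0/1 data like the example below. Doesn't use tf.where.

First, make a second random sequence, of same length:

a2 = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)

Then combine the two, "bitwise-and"ing corresponding elements. Bitwise AND needs integer inputs, so the float32 samples are cast first:

def and2(x, y):
    # cast the float 0./1. samples to integers before the bitwise AND
    return tf.bitwise.bitwise_and(tf.cast(x, tf.int32), tf.cast(y, tf.int32))

b = and2(a, a2)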

Example data:

a = [0,0,1,1]
a2 = [0,1,0,1]

Result:

b = [0,0,0,1]

Explanation:

The values in a2 are irrelevant where a is 0, and are 50-50 on average where a is 1.
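
For values restricted to 0 and 1, an elementwise AND is the same as an elementwise product, which also works on float data without a cast. The NumPy sketch below checks this against the example data above. The names b_and and b_mul are illustrative only.

```python
import numpy as np

a = np.array([0, 0, 1, 1])
a2 = np.array([0, 1, 0, 1])

b_and = a & a2  # bitwise AND, valid because both arrays hold integers
b_mul = a * a2  # elementwise product, also valid for float 0./1. arrays

print(b_and)  # [0 0 0 1]
print(b_mul)  # [0 0 0 1]
```

Neither form extends to the general case with a continuous distribution, where a mask-based selection is needed instead.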

CodePudding user response：

You could try something like this, which will create a new random value (1.0 or 0.0) for every position in a where the value is 1.0 and then update a with those new values. If a does not contain any 1.0 values, it remains the same:

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

a = tfd.Binomial(total_count = 1.0, probs = 0.5).sample(6)
indices = tf.where(tf.equal(a, 1.0))

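n_ones = tf.shape(indices)[0]

new_values, indices = tf.cond(
    tf.not_equal(n_ones, 0),
    # at least one 1.0: draw one independent value per matching position
    lambda: (tf.squeeze(tf.stack([tfd.Binomial(total_count = 1.0, probs = 0.5).sample(1) for _ in tf.range(n_ones)]), axis=1), indices),
    # no 1.0 at all: write a back onto itself; indices need shape (N, 1), like the tf.where output
    lambda: (a, tf.expand_dims(tf.cast(tf.range(tf.shape(a)[0]), tf.int64), axis=1)))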
print('a before -->', a)
print('indices -->', indices)
a = tf.tensor_scatter_nd_update(a, indices, new_values)
print('a after -->',a)

a before --> tf.Tensor([1. 1. 0. 1. 0. 0.], shape=(6,), dtype=float32)
indices --> tf.Tensor(
[[0]
 [1]
 [3]], shape=(3, 1), dtype=int64)
a after --> tf.Tensor([0. 0. 0. 1. 0. 0.], shape=(6,), dtype=float32)
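
For comparison, the same index-based update can be sketched in NumPy, which the question title mentions as an alternative. This is an illustration rather than a translation of the TensorFlow code above. The names rng and b are hypothetical, the seed is arbitrary, and the starting vector is copied from the printed output.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed, only for reproducibility

a = np.array([1., 1., 0., 1., 0., 0.])
indices = np.flatnonzero(a == 1.0)  # positions that hold a 1

# One independent 0/1 draw per selected position; the array is simply
# empty when a contains no 1s, so no special case is needed here.
new_values = rng.integers(0, 2, size=indices.size).astype(a.dtype)

b = a.copy()             # leave the original vector untouched
b[indices] = new_values  # overwrite only the selected positions

print('indices -->', indices)
print('b -->', b)
```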