Using TensorFlow, I need to calculate multiple grouped metrics (e.g. the mean) from a tensor, based on groups that are given by a second tensor.
As a toy example, let's say I have:
values = tf.constant([[1], [1.5], [2], [2.5], [3], [3.5]])
groupings = tf.constant([1,3,2,1,2,3])
And I want to calculate:
group_means = [1.75, 2.5, 2.5]
I know how to calculate the mean for one group, e.g.:
group = tf.boolean_mask(values, tf.equal(groupings, i))
mean = tf.math.reduce_mean(group,axis=0)
and could do that for every group in a for-loop.
What I can't figure out is how to do that in a vectorized manner, without looping through each group. Maybe it is a very easy question with an obvious solution, but I appreciate any help.
CodePudding user response:
Try tf.math.unsorted_segment_mean:
import tensorflow as tf
values = tf.constant([[1], [1.5], [2], [2.5], [3], [3.5]])
groupings = tf.constant([1,3,2,1,2,3]) - 1  # shift the labels to 0-based segment ids
num_segments = tf.reduce_max(groupings) + 1  # ids run from 0 to max, so max + 1 segments
tf.math.unsorted_segment_mean(values, groupings, num_segments)
<tf.Tensor: shape=(3, 1), dtype=float32, numpy=
array([[1.75],
       [2.5 ],
       [2.5 ]], dtype=float32)>
Note that I subtract 1 from groupings, since it is easier when segment_ids start from 0.
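If the group labels are not a contiguous range (so subtracting 1 is not enough), one option is to remap them to 0-based ids with tf.unique first. The following is only a sketch under that assumption; the labels 10, 20, 30 and the names labels and segment_ids are made up for illustration:

```python
import tensorflow as tf

values = tf.constant([[1.0], [1.5], [2.0], [2.5], [3.0], [3.5]])
# Hypothetical labels that are neither zero-based nor contiguous.
groupings = tf.constant([10, 30, 20, 10, 20, 30])

# tf.unique returns the distinct labels (in order of first appearance)
# and, for every element, the index of its label in that list.
labels, segment_ids = tf.unique(groupings)

# One mean per distinct label, computed without a Python loop.
group_means = tf.math.unsorted_segment_mean(
    values, segment_ids, num_segments=tf.size(labels)
)
```

Row k of group_means then holds the mean for labels[k], so the rows follow the order in which the labels first appear rather than sorted order.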