Vectorized calculation of grouped metrics in tensorflow

Time:10-11

Using TensorFlow, I need to calculate multiple grouped metrics (e.g. the mean) from a tensor, where the groups are given by a second tensor.

As a toy example, let's say I have:
values = tf.constant([[1], [1.5], [2], [2.5], [3], [3.5]])
groupings = tf.constant([1,3,2,1,2,3])

And I want to calculate:
group_means = [1.75, 2.5, 2.5]

I know how to calculate the mean for one group i, e.g.:
group = tf.boolean_mask(values, tf.equal(groupings, i))
mean = tf.math.reduce_mean(group, axis=0)
and I could do that for every group in a for-loop.
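
For reference, the loop-based version described above might look like the following minimal sketch. The names group_ids, means and loop_means are illustrative and not part of the original question, and the sketch assumes eager execution so that .numpy() is available.

```python
import tensorflow as tf

values = tf.constant([[1], [1.5], [2], [2.5], [3], [3.5]])
groupings = tf.constant([1, 3, 2, 1, 2, 3])

# Distinct group labels, sorted so the output rows follow label order 1, 2, 3.
group_ids = tf.sort(tf.unique(groupings)[0])

means = []
for i in group_ids.numpy():
    # Select the rows that belong to group i and average them.
    group = tf.boolean_mask(values, tf.equal(groupings, i))
    means.append(tf.math.reduce_mean(group, axis=0))

# Stack the per-group means into one tensor of shape (num_groups, 1).
loop_means = tf.stack(means)
print(loop_means.numpy())
```

This runs one mask and one reduction per group, which is the per-group Python overhead the question is trying to avoid.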

What I can't figure out is how to do that in a vectorized manner, without looping through each group. Maybe it is a very easy question with an obvious solution, but I appreciate any help.

CodePudding user response:

Try tf.math.unsorted_segment_mean:

import tensorflow as tf

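values = tf.constant([[1], [1.5], [2], [2.5], [3], [3.5]])
# Shift the 1-based group labels so that the segment ids start at 0.
groupings = tf.constant([1,3,2,1,2,3]) - 1
# Number of segments = number of distinct group ids.
num_segments = tf.shape(tf.unique_with_counts(groupings)[-1])[-1]
tf.math.unsorted_segment_mean(values, groupings, num_segments)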
<tf.Tensor: shape=(3, 1), dtype=float32, numpy=
array([[1.75],
       [2.5 ],
       [2.5 ]], dtype=float32)>

Note that I subtract 1 from groupings because the segment ids are used as zero-based row indices into the output, so they should fall in the range 0 to num_segments - 1.
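
If the group labels are not consecutive integers, subtracting 1 is not enough. One possible approach, sketched here under the assumption that the labels are arbitrary integers, is to let tf.unique remap them to zero-based ids first. The labels 10, 20, 30 and the variable names below are made up for illustration.

```python
import tensorflow as tf

values = tf.constant([[1], [1.5], [2], [2.5], [3], [3.5]])
# Hypothetical non-consecutive group labels (not from the original question).
labels = tf.constant([10, 30, 20, 10, 20, 30])

# tf.unique returns the distinct labels in order of first appearance,
# plus, for every element, the index of its label in that list.
unique_labels, segment_ids = tf.unique(labels)
num_segments = tf.size(unique_labels)

# Row k of the result is the mean of the group labelled unique_labels[k].
group_means = tf.math.unsorted_segment_mean(values, segment_ids, num_segments)

print(unique_labels.numpy())
print(group_means.numpy())
```

The rows of group_means follow the order of unique_labels (first appearance), not sorted label order, so keep unique_labels around to map each row back to its group.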
