I'm new to TensorFlow, and I was trying to understand how seed generation works (mainly with functions). My confusion is this: if I run this bit of code in my main program:
a = tf.random.uniform([1], seed=1)
b = tf.random.uniform([1], seed=1)
print(a)
print(b)
I get this output
tf.Tensor([0.2390374], shape=(1,), dtype=float32)
tf.Tensor([0.22267115], shape=(1,), dtype=float32)
two different tensors (obviously), but if I try to call it from a function:
@tf.function
def foo():
    a = tf.random.uniform([1], seed=1)
    b = tf.random.uniform([1], seed=1)
    print(a)
    print(b)
    return a, b
foo()
I get this:
Tensor("random_uniform/RandomUniform:0", shape=(1,), dtype=float32)
Tensor("random_uniform_1/RandomUniform:0", shape=(1,), dtype=float32)
(<tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.2390374], dtype=float32)>,
<tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.2390374], dtype=float32)>)
There are two things I don't get:
- Why did the two prints inside the function show different output to the print from the return value of the function? In other words, why was no value (numpy=array...) shown in the first two lines compared to the last two?
- Why are a and b equal?
CodePudding user response:
Let's answer your first question
Why did the two prints inside the function show different output to the print from the return value of the function? In other words, why was no value (numpy=array...) shown in the first two lines compared to the last two?
Eager Execution vs Graph Execution
At its core, TensorFlow works with tensor operations (ops). The expression tf.random.uniform([1], seed=1), for example, defines an op that, when executed, will generate a single random number with a seed of 1.
Now, you will need to actually execute this op to get a concrete random number. There are two ways to do that:
- Execute the op as soon as it is defined. This is called Eager Execution (EE), and is arguably the most intuitive and flexible way to execute an op. It is also the default way to execute ops in TF 2.x.
- Compose this op and possibly other ops together into a static, optimized computational graph and use that optimized graph for every execution. This is called Graph Execution (GE). This was the default way to execute ops in TF 1.x, while in TF 2.x it is optional and can be enabled using the @tf.function decorator.
In an EE context, the op tf.random.uniform([1], seed=1) is executed as soon as it is defined and immediately returns a concrete value - an EagerTensor object.
a = tf.random.uniform([1], seed=1)
print(type(a))
[Out]:
<class 'tensorflow.python.framework.ops.EagerTensor'>
In GE, by contrast, the op is simply a symbolic reference to a node in the optimized computational graph (a Tensor object, to be exact). The op is only actually executed when you invoke the function, at which point it produces a concrete value - an EagerTensor.
@tf.function
def genRandom():
    a = tf.random.uniform([1], seed=1)
    print(f'a type is {type(a)}')
    return a
print(f'output type is {type(genRandom())}')
[Out]:
a type is <class 'tensorflow.python.framework.ops.Tensor'>
output type is <class 'tensorflow.python.framework.ops.EagerTensor'>
GE enables various static-graph optimizations that improve execution performance. EE, however, is far more intuitive and easier to debug, as ops are executed on the fly instead of being converted into cryptic graph nodes. A good practice is to debug functions in EE and wrap them in @tf.function for final production code to leverage the speed advantages of GE. Be careful, though, as some ops that are allowed in an EE context are not allowed or not available in a GE context.
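In TF 2.x you can also toggle this globally while debugging. Here is a minimal sketch of that workflow, assuming tf.config.run_functions_eagerly is available (it is in TF 2.x):
import tensorflow as tf

@tf.function
def foo():
    a = tf.random.uniform([1], seed=1)
    print(a)  # plain Python print, handy while debugging
    return a

# While debugging: force @tf.function-decorated functions to run eagerly
tf.config.run_functions_eagerly(True)
foo()   # runs eagerly, so print shows a concrete EagerTensor on every call

# Back to graph execution for speed
tf.config.run_functions_eagerly(False)
foo()   # traced and executed as a graph again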
Graph Execution: Tracing
Going back to this example:
@tf.function
def genRandom():
    a = tf.random.uniform([1], seed=1)
    print(f'a is {a}')
    return a
_ = genRandom()
[Out]:
a is Tensor("random_uniform/RandomUniform:0", shape=(1,), dtype=float32)
If in GE the op is only executed when the function is called, at which point it produces a concrete value, then why does the Python print inside the function show a graph node reference instead of the concrete value that a produces?
This is because Python's print is not a TF op, nor is it convertible to one. Therefore, it won't be included in the computational graph and won't be executed. However, you did get a printout when running the function, so what's happening here?
It turns out that TF constructs graphs lazily. That is, only when a function decorated with @tf.function is called for the first time does TF start constructing its graph. This is done by executing the function as normal Python code once to "record" the ops within it - a process known as tracing - from which an optimized graph is constructed. After that, it is the constructed graph (and not the Python code) that is called to produce the output.
In other words, Python's print - which is not an op - works for the first call thanks to the tracing procedure. However, once tracing is done and the graph is constructed, the print is not part of the execution graph and won't run on subsequent calls:
@tf.function
def genRandom():
    a = tf.random.uniform([1], seed=1)
    print(f'a is {a}')  # not an op, will not be included in graph and only prints during tracing
    return a
_ = genRandom()
[Out]:
a is Tensor("random_uniform/RandomUniform:0", shape=(1,), dtype=float32)
_ = genRandom()
[Out]:
To get the actual value when the graph is executed, you'll need to use tf.print(), which is a valid TF op and will therefore be included in the graph.
@tf.function
def genRandom():
    a = tf.random.uniform([1], seed=1)
    tf.print("a is:", a)  # an op, will be included in graph and always prints
    return a
_ = genRandom()
[Out]:
a is: [0.239037395]
_ = genRandom()
[Out]:
a is: [0.222671151]
As a side note, if you call an already-graphed function with arguments whose types differ from the ones the function saw when it was last traced, retracing may happen (because the current graph may not be usable for the new types), in which case the Python code inside the function is executed again. Retracing is very expensive and should be avoided.
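For illustration, here is a minimal sketch (the function square is made up for this example) of when retracing happens:
@tf.function
def square(x):
    print('Tracing with', x)  # plain Python print: only runs while tracing
    return x * x

square(tf.constant(2.0))  # first call: traces a graph for float32 tensors
square(tf.constant(3.0))  # same tensor spec: reuses the graph, no retrace
square(2)                 # Python int instead of a tensor: new signature, retraces
square(3)                 # a different Python value: retraces yet again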
For the second question
Why are a and b equal in GE but not in EE?
tf.random seeds
Now that you know what GE and EE are, go through this doc. It explains everything you wish to know about what the seed argument does and how an op's internal representation differs between GE and EE mode. In short, the random number a tf.random.uniform() op creates depends entirely on three factors:
- The global seed (which is set via tf.random.set_seed)
- The op's own seed (which is passed to its constructor as the seed argument)
- The op's internal counter, say _count, which counts the number of times the op has been executed. _count is reset to 0 every time tf.random.set_seed is called (see the sketch after this list)
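As a rough sketch of that counter-reset behavior in EE mode (the seed values 1234 and 1 are arbitrary, and the concrete numbers will depend on your TF version):
tf.random.set_seed(1234)
print(tf.random.uniform([1], seed=1))  # some value 'A1' (counter = 0)
print(tf.random.uniform([1], seed=1))  # a different value 'A2' (counter = 1)

tf.random.set_seed(1234)                # resets the op's counter back to 0
print(tf.random.uniform([1], seed=1))  # 'A1' again
print(tf.random.uniform([1], seed=1))  # 'A2' again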
In EE mode, identical ops share the same internal representation, and thus your a and b share the same _count. You get two different numbers because the value of _count involved in generating them differs by 1.
In GE mode, each op has its own internal representation, and thus a and b have their own _count, which explains why you get the same number: everything involved in generating the two numbers is identical.