I want retrieval of a GCS object (or any S3 object) to be part of the model: a first layer that obtains the features based on a filename, because this lowers the networking overhead. I am trying to wrap the download in a tf.function, but without success.
Here is an MWE:
import tensorflow as tf

@tf.function
def load_file(a):
    if tf.is_tensor(a):
        a_path = tf.strings.substr(a, 0, 2) + "/" + a
    else:
        a_path = a[0:2] + "/" + a
    with tf.io.gfile.GFile("gs://some_bucket" + a_path) as f:
        return f.read()

load_file(tf.constant("file3"))
which raises the following error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [22], line 9
      7     with tf.io.gfile.GFile("gs://some_bucket" + a_path) as f:
      8         return f.read()
----> 9 load_file(tf.constant("file3"))

File /opt/conda/envs/wanna-hmic/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    151 except Exception as e:
    152     filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153     raise e.with_traceback(filtered_tb) from None
    154 finally:
    155     del filtered_tb

File /opt/conda/envs/wanna-hmic/lib/python3.9/site-packages/tensorflow/python/framework/func_graph.py:1147, in func_graph_from_py_func.<locals>.autograph_handler(*args, **kwargs)
   1145 except Exception as e:  # pylint:disable=broad-except
   1146     if hasattr(e, "ag_error_metadata"):
-> 1147         raise e.ag_error_metadata.to_exception(e)
   1148     else:
   1149         raise

TypeError: in user code:

    File "/tmp/ipykernel_4006/3877294148.py", line 8, in load_file  *
        return f.read()

    TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
        1. tensorflow.python.lib.io._pywrap_file_io.BufferedInputStream(filename: str, buffer_size: int, token: tensorflow.python.lib.io._pywrap_file_io.TransactionToken = None)

    Invoked with: <tf.Tensor 'add_2:0' shape=() dtype=string>, 524288
The code works well in eager mode when called as load_file("file3"), but to perform well I need it to work in graph mode too.
CodePudding user response:
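tf.io.read_file does the trick. tf.io.gfile.GFile is a Python-level file object that needs a Python str as its filename, but inside a tf.function the path is a symbolic string tensor, hence the TypeError. tf.io.read_file is a TensorFlow op, so it accepts a string tensor.
Modifying the whole code to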
@tf.function
def load_file(a):
    a = tf.convert_to_tensor(a)
    a_path = tf.strings.substr(a, 0, 2) + "/" + a
    return tf.io.read_file("gs://some_bucket" + a_path)
makes it work in both eager and graph mode, and makes the input consistently a tf.Tensor.
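
To try the same pattern without access to a bucket, here is a minimal sketch that reads from a temporary local directory instead of gs://some_bucket. The make_loader helper, the root argument and the fi/file3 layout are assumptions made for illustration, not part of the original code. The path is built with tf.strings.join, so it stays a tensor inside the graph.

```python
import os
import tempfile

import tensorflow as tf


def make_loader(root):
    """Return a graph-mode loader reading <root>/<first two chars of name>/<name>.

    `root` is a hypothetical stand-in for the bucket prefix.
    """

    @tf.function(input_signature=[tf.TensorSpec(shape=(), dtype=tf.string)])
    def load_file(name):
        # Build the path with string ops so it remains a tensor in the graph.
        prefix = tf.strings.substr(name, 0, 2)
        path = tf.strings.join([root, prefix, name], separator="/")
        # tf.io.read_file is a TensorFlow op and accepts a string tensor.
        return tf.io.read_file(path)

    return load_file


# Local layout standing in for the bucket: <tmp>/fi/file3
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "fi"))
with open(os.path.join(root, "fi", "file3"), "wb") as f:
    f.write(b"hello")

load_file = make_loader(root)
content = load_file(tf.constant("file3"))  # scalar tf.string tensor holding the file bytes
```

The input_signature keeps the function from retracing for every new filename. Passing a gs:// prefix as root should behave the same way, assuming the TensorFlow build in use has GCS filesystem support.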