tf.io.GFile with Tensor String Input

Time:01-09

I want retrieval of a GCS object (or any S3 object) to be part of the model itself: a first layer that obtains features from the filename, since this would lower the networking overhead. I am trying to wrap the download in a tf.function, but without success. Here is an MWE:

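import tensorflow as tf

@tf.function
def load_file(a):
    # Build the object key "<first two characters>/<name>".
    if tf.is_tensor(a):
        a_path = tf.strings.substr(a, 0, 2) + "/" + a
    else:
        a_path = a[0:2] + "/" + a
    # GFile is a Python-level file object; this is the call that fails for tensor input.
    with tf.io.gfile.GFile("gs://some_bucket" + a_path) as f:
        return f.read()

load_file(tf.constant("file3"))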

which raises the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [22], line 9
      7     with tf.io.gfile.GFile("gs://some_bucket" + a_path) as f:
      8         return f.read()
----> 9 load_file(tf.constant("file3"))

File /opt/conda/envs/wanna-hmic/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    151 except Exception as e:
    152   filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153   raise e.with_traceback(filtered_tb) from None
    154 finally:
    155   del filtered_tb

File /opt/conda/envs/wanna-hmic/lib/python3.9/site-packages/tensorflow/python/framework/func_graph.py:1147, in func_graph_from_py_func.<locals>.autograph_handler(*args, **kwargs)
   1145 except Exception as e:  # pylint:disable=broad-except
   1146   if hasattr(e, "ag_error_metadata"):
-> 1147     raise e.ag_error_metadata.to_exception(e)
   1148   else:
   1149     raise

TypeError: in user code:

File "/tmp/ipykernel_4006/3877294148.py", line 8, in load_file  *
    return f.read()

TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
    1. tensorflow.python.lib.io._pywrap_file_io.BufferedInputStream(filename: str, buffer_size: int, token: tensorflow.python.lib.io._pywrap_file_io.TransactionToken = None)

Invoked with: <tf.Tensor 'add_2:0' shape=() dtype=string>, 524288

The code works in eager mode with load_file("file3"), but for performance reasons I need it to work in graph mode as well.

CodePudding user response:

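tf.io.read_file does the trick. tf.io.gfile.GFile is a Python-level wrapper whose constructor needs a Python str, whereas inside a tf.function the path is a symbolic string tensor, as the "Invoked with: <tf.Tensor ...>" line of the traceback shows. tf.io.read_file is a TensorFlow op that accepts a string tensor as the filename. Modifying the whole code to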

@tf.function
def load_file(a):
    a = tf.convert_to_tensor(a)
    a_path = tf.strings.substr(a, 0, 2) + "/" + a
    return tf.io.read_file("gs://some_bucket" + a_path)

makes it work in both eager and graph mode, and makes the input consistently a tf tensor.
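To try the same pattern without access to a bucket, here is a minimal sketch that swaps the GCS prefix for a local temporary directory. The `prefix` argument, the temporary directory, and the file contents are assumptions made for illustration and are not part of the original question; only the key layout (first two characters, "/", full name) is taken from it.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical local stand-in for the bucket: <tmp>/fi/file3
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "fi"))
with open(os.path.join(root, "fi", "file3"), "wb") as f:
    f.write(b"hello")


@tf.function
def load_file(a, prefix):
    # Accept either a Python string or a string tensor.
    a = tf.convert_to_tensor(a)
    # Same key layout as in the question: "<first two characters>/<name>".
    a_path = tf.strings.substr(a, 0, 2) + "/" + a
    # read_file is a graph op, so a symbolic string tensor is a valid filename.
    return tf.io.read_file(prefix + "/" + a_path)


content = load_file(tf.constant("file3"), tf.constant(root))
print(content.numpy())
```

The result is a scalar string tensor holding the raw bytes of the file, so any decoding into features would be a separate step after the read.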
