Can an expert help me? When training resnet50, TensorFlow always reports an error.

Time:09-22

I am training a model with cifar10 and resnet50.

I run the following command:
python train_image_classifier.py \
  --train_dir=cifar10/train_dir \
  --dataset_name=cifar10 \
  --dataset_split_name=train \
  --dataset_dir=cifar10/data \
  --model_name=resnet_v2_50 \
  --checkpoint_path=pretrained/resnet_v2_50.ckpt \
  --checkpoint_exclude_scopes=resnet_v2_50/logits \
  --max_number_of_steps=50 \
  --batch_size=16 \
  --learning_rate=0.001 \
  --log_every_n_steps=100 \
  --optimizer=adam
This is the output:
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from cifar10/train_dir/model.ckpt-62
I0308 18:30:32.481478 140012097914688 saver.py:1280] Restoring parameters from cifar10/train_dir/model.ckpt-62
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Cannot assign a device for operation resnet_v2_50/Pad: node resnet_v2_50/Pad (defined at /home/belle/xun/slim/nets/resnet_utils.py:122) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[resnet_v2_50/Pad]]

Errors may have originated from an input operation.
Input Source operations connected to node resnet_v2_50/Pad:
 fifo_queue_Dequeue (defined at train_image_classifier.py:488)
I0308 18:30:32.733913 140012097914688 coordinator.py:224] Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Cannot assign a device for operation resnet_v2_50/Pad: node resnet_v2_50/Pad (defined at /home/belle/xun/slim/nets/resnet_utils.py:122) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[resnet_v2_50/Pad]]

Errors may have originated from an input operation.
Input Source operations connected to node resnet_v2_50/Pad:
 fifo_queue_Dequeue (defined at train_image_classifier.py:488)
Traceback (most recent call last):
  File "/home/belle/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/home/belle/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1339, in _run_fn
    self._extend_graph()
  File "/home/belle/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation resnet_v2_50/Pad: {{node resnet_v2_50/Pad}} was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
[[resnet_v2_50/Pad]]
...
E0308 18:30:33.083234 140012097914688 tf_should_use.py:71] ==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):

If you want to mark it as used call its "mark_used()" method.
It was originally created here:
  File "train_image_classifier.py", line 608, in <module>
    tf.app.run()
  File "/home/belle/local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/belle/local/lib/python3.6/site-packages/absl/app.py", line 321, in run
    raise
  File "/home/belle/local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "train_image_classifier.py", line 604, in main
    sync_optimizer=optimizer if FLAGS.sync_replicas else None)
  File "/home/belle/local/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 796, in train
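
A note on reading the "Cannot assign a device" message: the graph asks for /device:GPU:0, but the list of available devices only contains CPU, XLA_CPU and XLA_GPU entries. XLA_GPU is a different device type from GPU, so the request cannot be satisfied. The small sketch below only illustrates that comparison on the device strings from the log. It is not TensorFlow's actual placement code, and the helper names (device_type_and_index, is_available) are made up for this example.

```python
import re

# Device list copied from the error message in the log above.
AVAILABLE = [
    "/job:localhost/replica:0/task:0/device:CPU:0",
    "/job:localhost/replica:0/task:0/device:XLA_CPU:0",
    "/job:localhost/replica:0/task:0/device:XLA_GPU:0",
]


def device_type_and_index(spec):
    """Return (type, index) from a spec such as '/device:GPU:0'."""
    match = re.search(r"device:([A-Za-z_]+):(\d+)$", spec)
    if match is None:
        raise ValueError("unrecognized device spec: %r" % spec)
    return match.group(1), int(match.group(2))


def is_available(requested, available):
    """True if some available device has the same type and index."""
    wanted = device_type_and_index(requested)
    return any(device_type_and_index(d) == wanted for d in available)


# Hypothetical helper names; this is a simplified string comparison only.
print(is_available("/device:GPU:0", AVAILABLE))  # False: only XLA_GPU is listed
print(is_available("/device:CPU:0", AVAILABLE))  # True
```

If that reading is right, the question becomes why this TensorFlow install does not expose a plain GPU device (for example a CPU-only build, or CUDA libraries that fail to load; both are guesses, since the log shown here does not say). As far as I know, the slim train_image_classifier.py script also defines a --clone_on_cpu flag that places the model clones on the CPU, which may be enough to get past this error when no GPU is usable.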