I have the latest version of TF and the latest pull of the inception model installed.
I am trying to train Inception from scratch.
Note: I have 2000 images of about 2.5 megapixels each, divided into 2 labels.
Note: I don't have bounding boxes in the image data set.
Step 1: I successfully sharded my data set of about 2000 images into TFRecord files using:
bazel-bin/inception/build_image_data --train_directory="/home/airig/scratch/img_data_set/train_img" --validation_directory="/home/airig/scratch/img_data_set/validation_img" --output_directory="/home/airig/scratch/img_data_set/out_dir" --labels_file="/home/airig/scratch/img_data_set/gro_labels.txt"
Finished writing all 2000 images in data set.
Step 2: I ran:
bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=16 --train_dir=/home/airig/scratch/img_data_set/train_dir --data_dir=/home/airig/scratch/img_data_set/out_dir
but now I am getting a new error:
airig@airig-Inspiron-7559:~/scratch/tensorflow/models/inception$ bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=16 --train_dir=/home/airig/scratch/img_data_set/train_dir --data_dir=/home/airig/scratch/img_data_set/out_dir
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.8.0 locally
E tensorflow/core/framework/op_kernel.cc:925] OpKernel ('op: "NegTrain" device_type: "CPU"') for unknown op: NegTrain
E tensorflow/core/framework/op_kernel.cc:925] OpKernel ('op: "Skipgram" device_type: "CPU"') for unknown op: Skipgram
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Traceback (most recent call last):
File "/home/airig/scratch/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/imagenet_train.py", line 41, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/airig/scratch/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/imagenet_train.py", line 37, in main
inception_train.train(dataset)
File "/home/airig/scratch/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/inception_train.py", line 216, in train
num_preprocess_threads=num_preprocess_threads)
File "/home/airig/scratch/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/image_processing.py", line 136, in distorted_inputs
num_readers=FLAGS.num_readers)
File "/home/airig/scratch/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/image_processing.py", line 491, in batch_inputs
image = image_preprocessing(image_buffer, bbox, train, thread_id)
File "/home/airig/scratch/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/image_processing.py", line 326, in image_preprocessing
image = distort_image(image, height, width, bbox, thread_id)
File "/home/airig/scratch/tensorflow/models/inception/bazel-bin/inception/imagenet_train.runfiles/inception/inception/image_processing.py", line 224, in distort_image
tf.image_summary('image_with_bounding_boxes', image_with_box)
AttributeError: 'module' object has no attribute 'image_summary'
Can you please point out what's wrong here? Is there something I missed?
@anuj2rock
'module' object has no attribute 'image_summary': please switch to tf.summary.image.
For the 'concat (from tensorflow.python.ops.array_ops)' deprecation warning: switch to tf.concat_v2().
After fixing this, I am getting a new error:
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:120] successfully opened CUDA library libcurand.so.7.5 locally
Traceback (most recent call last):
File "train_image_classifier.py", line 585, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train_image_classifier.py", line 482, in main
clones = model_deploy.create_clones(deploy_config, clone_fn, [batch_queue])
File "/home/twisted/git/tensorflow/models/slim/deployment/model_deploy.py", line 195, in create_clones
outputs = model_fn(*args, **kwargs)
File "train_image_classifier.py", line 466, in clone_fn
logits, end_points = network_fn(images)
File "/home/twisted/git/tensorflow/models/slim/nets/nets_factory.py", line 105, in network_fn
return func(images, num_classes, is_training=is_training)
File "/home/twisted/git/tensorflow/models/slim/nets/inception_v3.py", line 481, in inception_v3
depth_multiplier=depth_multiplier)
File "/home/twisted/git/tensorflow/models/slim/nets/inception_v3.py", line 161, in inception_v3_base
net = tf.concat_v2(3, [branch_0, branch_1, branch_2, branch_3])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1057, in concat_v2
dtype=dtypes.int32).get_shape(
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 651, in convert_to_tensor
as_ref=False)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
How do you solve it?
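A likely cause, judging from the traceback: tf.concat_v2 takes the list of tensors first and the axis second, which is the reverse of the old tf.concat(concat_dim, values). Renaming the function without swapping the arguments passes the list where an int32 axis is expected, which matches the TypeError above. The call on line 161 of inception_v3.py would then read tf.concat_v2([branch_0, branch_1, branch_2, branch_3], 3). The sketch below is a hypothetical helper (swap_concat_args is not part of TensorFlow or the models repo) that uses only the Python standard library to rewrite the simple call sites. It assumes a literal integer axis and a flat list; nested concat calls would still need fixing by hand.

```python
import re

# Matches tf.concat_v2(<int>, [<flat list>]); lists with nested brackets
# are deliberately not matched and must be fixed manually.
_OLD_ORDER = re.compile(r"tf\.concat_v2\(\s*(\d+)\s*,\s*(\[[^\[\]]*\])\s*\)")


def swap_concat_args(source):
    """Rewrite tf.concat_v2(axis, values) as tf.concat_v2(values, axis).

    Hypothetical helper for illustration; calls that already pass the
    list first do not match the pattern and are returned unchanged.
    """
    return _OLD_ORDER.sub(r"tf.concat_v2(\2, \1)", source)


line = "net = tf.concat_v2(3, [branch_0, branch_1, branch_2, branch_3])"
print(swap_concat_args(line))
# prints: net = tf.concat_v2([branch_0, branch_1, branch_2, branch_3], 3)
```

Running the helper over each file under slim/nets and reviewing the diff before saving is safer than editing in place, since the pattern only covers the simplest form of the call.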
@czd2003 I tried the TF-Slim way of training from scratch. Use this as a guide; I hope it helps.
@czd2003 Also, if you succeed in training and evaluation (like I did), please share whether a .pb graph file was generated at the end.
Changes
File "lib\tensorflow_models\slim\preprocessing\inception_preprocessing.py", line 195, in preprocess_for_train
tf.image_summary('image_with_bounding_boxes', image_with_box)
AttributeError: module 'tensorflow' has no attribute 'image_summary'
LINE 195,203
FROM
tf.image_summary
TO
tf.summary.image
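If the old name appears in more than a couple of places, the same FROM/TO change can be applied with a small script instead of by hand. This is only a sketch using the Python standard library; rename_image_summary is a hypothetical helper, not something shipped with TensorFlow or the models repo. It renames the function only, so any call that passes the old max_images keyword would, as far as I know, also need that keyword changed to max_outputs.

```python
import re

# Word boundaries keep longer names such as my_tf.image_summary or
# tf.image_summary_x from being rewritten by accident.
_OLD_CALL = re.compile(r"\btf\.image_summary\b")


def rename_image_summary(source):
    """Return (new_source, replacement_count) with the old name replaced."""
    return _OLD_CALL.subn("tf.summary.image", source)


old = "tf.image_summary('image_with_bounding_boxes', image_with_box)"
new, count = rename_image_summary(old)
print(new)    # prints: tf.summary.image('image_with_bounding_boxes', image_with_box)
print(count)  # prints: 1
```

The replacement count makes it easy to confirm that both lines 195 and 203 were touched when the helper is run on the contents of inception_preprocessing.py.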