opencv_dnn produces incorrect inferences after transform_graph
I am running into a problem that a few other people have also reported in Pull Request #9517.
I trained a MobileNet using TensorFlow's retrain.py,
as described in the HackerNoon post Creating insanely fast image classifiers with MobileNet in TensorFlow.
Once the imagery is set up, the network is trained as follows:
#!/bin/bash -xe
TF_ROOT=/home/ubuntu/src/tensorflow/tensorflow
DATA_ROOT=/home/ubuntu/data
python $TF_ROOT/examples/image_retraining/retrain.py \
--image_dir $DATA_ROOT \
--learning_rate=0.001 \
--testing_percentage=20 \
--validation_percentage=20 \
--train_batch_size=32 \
--validation_batch_size=-1 \
--flip_left_right \
--random_crop=30 \
--random_scale=30 \
--random_brightness=30 \
--eval_step_interval=100 \
--how_many_training_steps=2000 \
--architecture mobilenet_1.0_224
The graph is then transformed using TensorFlow's transform_graph
tool:
~/Development/tensorflow/bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=mobilenet_1.0_224.pb \
--out_graph=deploynet_1.0_224.pb \
--inputs=input \
--outputs=final_result \
--transforms="fold_constants sort_by_execution_order remove_nodes(op=Squeeze, op=PlaceholderWithDefault)"
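As a sanity check, running the transformed graph back through plain TensorFlow helps isolate whether transform_graph itself broke the model or whether the problem lies in OpenCV's importer. A minimal sketch against the TF 1.x API (the test image path, and the 127.5 mean/scale which follows retrain.py's MobileNet preprocessing, are assumptions here):

import cv2
import numpy as np
import tensorflow as tf

# Load the transformed graph produced by the command above.
graph_def = tf.GraphDef()
with tf.gfile.GFile('deploynet_1.0_224.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

# Preprocess one test image the way retrain.py's MobileNet expects:
# RGB order, 224x224, pixels scaled to [-1, 1].
img = cv2.cvtColor(cv2.imread('test.jpg'), cv2.COLOR_BGR2RGB)  # hypothetical path
img = cv2.resize(img, (224, 224)).astype(np.float32)
blob = ((img - 127.5) / 127.5)[np.newaxis, ...]  # shape [1, 224, 224, 3]

with tf.Session(graph=graph) as sess:
    probs = sess.run('final_result:0', feed_dict={'input:0': blob})
print('TF top class:', int(np.argmax(probs)), 'confidence:', float(np.max(probs)))

If this still classifies correctly, the transformed .pb is fine and the divergence happens inside OpenCV.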
TensorFlow's summarize_graph tool shows the following output:
Found 1 possible inputs: (name=input, type=float(1), shape=[1,224,224,3])
No variables spotted.
Found 1 possible outputs: (name=final_result, op=Softmax)
Found 4235007 (4.24M) const parameters, 0 (0) variable parameters, and 0 control_edges
Op types used: 86 Const, 28 Add, 27 Mul, 27 Relu6, 15 Conv2D, 13 DepthwiseConv2dNative, 1 AvgPool, 1 BiasAdd, 1 Identity, 1 MatMul, 1 Placeholder, 1 Reshape, 1 Softmax
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=/home/wlucas/Temp/dnn/deploynet_1.0_224.pb --show_flops --input_layer=input --input_layer_type=float --input_layer_shape=1,224,224,3 --output_layer=final_result
Testing inference through OpenCV's DNN module after these adjustments generally shows ~99% confidence in the second class, regardless of which class is actually presented (training imagery being the exception), whereas TensorFlow Mobile infers the same images correctly.
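For reference, the OpenCV side is exercised roughly like this (a sketch, not my exact harness; the scalefactor/mean/swapRB values mirror retrain.py's MobileNet preprocessing and are assumptions worth double-checking, since feeding raw BGR pixels produces exactly this kind of confidently wrong output):

import cv2
import numpy as np

# Load the transformed MobileNet graph with OpenCV's TensorFlow importer.
net = cv2.dnn.readNetFromTensorflow('deploynet_1.0_224.pb')

# Same preprocessing as the TF check above: 224x224, RGB, scaled to [-1, 1].
img = cv2.imread('test.jpg')  # hypothetical test image path
blob = cv2.dnn.blobFromImage(img, scalefactor=1.0 / 127.5, size=(224, 224),
                             mean=(127.5, 127.5, 127.5), swapRB=True)
net.setInput(blob)
probs = net.forward()
print('OpenCV top class:', int(np.argmax(probs)), 'confidence:', float(np.max(probs)))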
Any thoughts or help would be greatly appreciated!
@Will, is it possible to post a reference to the final model? Thanks!