Cannot use Tensorflow model with batch normalization [closed]

asked 2019-08-20 03:31:16 -0600

updated 2019-08-20 03:54:27 -0600

berak

I have a simple convolutional network built with Keras and TensorFlow 1.14. The model is saved as a frozen (constant) graph in binary .pb format.

The model loads successfully, but the computed outputs are incorrect from the first batch norm layer onward.

I am using OpenCV 3.4.

Has anyone encountered or heard of a similar problem?


Closed for the following reason: the question is answered, right answer was accepted by Nikolai Tasev
close date 2019-08-25 03:43:40.211350

Comments

If you really want to do deep learning in OpenCV, I suggest using the latest master 4.x branch. Fixes for these things land almost daily, so 3.4 will probably be heavily outdated...

StevenPuttemans (2019-08-20 05:00:20 -0600)

I'm getting similar problems with pytorch->onnx->dnn on 4.1.0.

A simple conv/bn/relu/pool network is highly inaccurate with a bn layer in it, and fine with the bn removed.

https://gist.github.com/berak/43ad415...

Solved my problem:

model.eval() needs to be called before saving the onnx file, to switch the model from "train" into "evaluation" mode, similar to "freezing" a tf network.

berak (2019-08-20 06:48:26 -0600)
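
Why the mode matters: in training mode a batch norm layer normalizes with the statistics of the current batch, while in evaluation mode it uses stored running statistics. The following is a minimal pure-Python sketch of that difference, not the PyTorch or OpenCV implementation. The function names, the sample batch and the running statistics are made up for illustration.

```python
import math


def batch_norm_train(xs, gamma, beta, eps=1e-5):
    """Training mode: normalize with the mean/variance of this batch."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]


def batch_norm_eval(xs, gamma, beta, running_mean, running_var, eps=1e-5):
    """Evaluation mode: normalize with the stored running statistics."""
    return [
        gamma * (x - running_mean) / math.sqrt(running_var + eps) + beta
        for x in xs
    ]


# Illustrative values only (assumptions, not taken from the thread).
batch = [1.0, 2.0, 3.0, 4.0]
train_out = batch_norm_train(batch, gamma=1.0, beta=0.0)
eval_out = batch_norm_eval(
    batch, gamma=1.0, beta=0.0, running_mean=0.5, running_var=4.0
)

print(train_out)
print(eval_out)
```

The two outputs differ for the same input, so a model exported while still in training mode can plausibly produce results that do not match the reference once a bn layer is involved.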

I also have a bn layer as the second layer, right after a conv. The outputs in OpenCV are quite different from those in TensorFlow. One thing looked strange: the Keras bn layer is implemented as several nodes in the TensorFlow graph, but in OpenCV I see only a single layer named fused_batchnorm.

Nikolai Tasev (2019-08-20 07:02:10 -0600)
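
A single fused layer is not by itself a sign of a wrong import: at inference time, the chain of subtract, multiply and add nodes can be folded into one multiply-add per channel. Here is a small sketch of that equivalence in plain Python. The function names and parameter values are hypothetical, and the eps value is an assumption too.

```python
import math


def bn_unfused(x, gamma, beta, mean, var, eps=1e-3):
    """Step by step, as a chain of separate graph nodes might compute it."""
    centered = x - mean
    inv_std = 1.0 / math.sqrt(var + eps)
    normalized = centered * inv_std
    return gamma * normalized + beta


def bn_fused(x, gamma, beta, mean, var, eps=1e-3):
    """Same arithmetic folded into a single scale and shift."""
    scale = gamma / math.sqrt(var + eps)
    shift = beta - mean * scale
    return scale * x + shift


# Illustrative parameters for one channel (assumptions).
params = dict(gamma=0.8, beta=0.1, mean=0.4, var=2.0)
values = [-1.5, 0.0, 2.25]

unfused = [bn_unfused(v, **params) for v in values]
fused = [bn_fused(v, **params) for v in values]
max_diff = max(abs(a - b) for a, b in zip(unfused, fused))
print(max_diff)  # tiny, floating-point rounding only
```

If the fused and unfused forms agree like this, a mismatch between frameworks more likely comes from the parameters or the data layout than from the fusion itself.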

Feel free to open an issue with steps to reproduce it (attach the model). We have observed buggy Keras batch normalization several times: it does not switch between training and testing mode properly. So if the latest master or the latest 3.4 branch produces wrong results, let's investigate it together, without voodoo debugging but with reproducible reports. Thanks!

dkurt (2019-08-20 07:09:14 -0600)

Will do. I will have to verify which version I am using and gather the relevant information.

Nikolai Tasev (2019-08-20 07:39:24 -0600)

I think there is already an issue on the topic here. Have you frozen the graph_def file as described in the comment?

paubau (2019-08-20 08:34:57 -0600)

Yes, I called tf.keras.backend.set_learning_phase(0) before loading the model from the Keras saved file, then used tf.graph_util.convert_variables_to_constants(...)

Nikolai Tasev (2019-08-21 06:01:59 -0600)
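
The steps described above could look roughly like this with the TensorFlow 1.x API. This is a sketch, not Nikolai's actual script: the function name and its path arguments are hypothetical, and it assumes a TensorFlow 1.14 installation and a model file that tf.keras.models.load_model can read.

```python
def freeze_keras_model(model_path, out_dir, pb_name):
    """Load a Keras model in inference mode and write a frozen binary .pb graph.

    Hypothetical helper; assumes TensorFlow 1.x (the thread uses 1.14).
    """
    import tensorflow as tf  # imported lazily so the helper can be defined without TF

    # Set before the model is built, so batch norm layers use inference behaviour.
    tf.keras.backend.set_learning_phase(0)
    model = tf.keras.models.load_model(model_path)

    sess = tf.keras.backend.get_session()
    output_names = [tensor.op.name for tensor in model.outputs]

    # Replace every variable in the graph with a constant holding its current value.
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), output_names
    )

    # as_text=False writes the binary .pb format mentioned in the question.
    tf.io.write_graph(frozen_graph_def, out_dir, pb_name, as_text=False)
    return output_names
```

A call such as freeze_keras_model("model.h5", ".", "frozen_model.pb") (both file names are placeholders) would return the output node names, which are useful when checking the result in another framework.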

I added the first part of the model (up to the batchnorm) and some test data to the issue https://github.com/opencv/opencv/issu...

Nikolai Tasev (2019-08-21 06:03:09 -0600)

I found the problem with dkurt's help. It seems the BatchNorm layer was wrongly configured for the channels-first data format instead of channels-last.

Nikolai Tasev (2019-08-25 03:41:18 -0600)
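
To illustrate why such a mismatch changes the numbers without raising an error: batch norm applies one scale and one shift per channel, so the result depends on which axis is treated as the channel axis. A minimal plain-Python sketch on a flat buffer follows. The helper name, the tiny 2x2x2 tensor and the parameter values are assumptions for illustration and are not taken from the model in the thread.

```python
def apply_per_channel(flat, channel_of, scale, shift):
    """Apply a folded batch norm (scale, shift) using channel_of(i) as the channel index."""
    return [
        v * scale[channel_of(i)] + shift[channel_of(i)]
        for i, v in enumerate(flat)
    ]


H, W, C = 2, 2, 2
# Data stored channels-last (H, W, C): consecutive values alternate between channels.
pixels_hwc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
scale = [1.0, 10.0]  # one entry per channel
shift = [0.0, 0.0]

# Correct for this buffer: the channel is the fastest-varying index.
channels_last = apply_per_channel(pixels_hwc, lambda i: i % C, scale, shift)

# Wrong for this buffer: treats it as (C, H, W), so the channel is the slowest index.
channels_first = apply_per_channel(pixels_hwc, lambda i: i // (H * W), scale, shift)

print(channels_last)
print(channels_first)
```

Both calls run without complaint, but the parameters land on different elements, which matches the symptom of outputs going wrong right after the first batch norm layer.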
