I am trying to train an inception_resnet_v2 model on another dataset. This is how I test the accuracy of the model:
with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope()):
    logits, _ = inception_resnet_v2.inception_resnet_v2(
        images,
        num_classes=dataset.num_classes,
        is_training=True)
Setting is_training to True gives better results on the validation set, but from the code I should have set it to False.
@sguada I think this problem may be related to batch_norm behaving differently during training and testing; there should be a better way to handle this.
I find it strange that in _get_variables_to_train, when I print out all the variable names, the only batch_norm variables included are the beta offsets, e.g. 'InceptionV3/Mixed_6b/Branch_2/Conv2d_0a_1x1/BatchNorm/beta:0'.
It's really confusing when I try fine-tuning, can someone explain this?
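A minimal sketch (plain NumPy, not the actual slim code; names illustrative) of why only beta shows up in the trainable list. As far as I know, slim's batch_norm defaults to center=True, scale=False, so no gamma variable is created, and moving_mean / moving_variance are updated by direct assignment (an exponential moving average), never by gradients, so they are not trainable:

```python
import numpy as np

# The variables a batch_norm layer owns, and whether the optimizer touches them.
bn_variables = {
    "beta": {"value": np.zeros(3), "trainable": True},             # learned offset
    "moving_mean": {"value": np.zeros(3), "trainable": False},     # EMA statistic
    "moving_variance": {"value": np.ones(3), "trainable": False},  # EMA statistic
}

# Something like _get_variables_to_train only collects the trainable ones,
# which is why only the beta names appear in the printout.
variables_to_train = [name for name, v in bn_variables.items() if v["trainable"]]
```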
When is_training=True, batch_norm uses the statistics of the current batch; when it is False, it uses the moving averages of the statistics computed during training.
When fine-tuning, it is sometimes better not to update the moving averages of the statistics, since you are not going to train for long. So try training with is_training=False and testing with is_training=False as well.
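A toy numerical sketch (NumPy, not TensorFlow; names illustrative) of the two modes described above, showing why they can give very different accuracy when the moving averages do not match the data:

```python
import numpy as np

def batch_norm(x, moving_mean, moving_var, is_training, eps=1e-3):
    if is_training:
        mean, var = x.mean(axis=0), x.var(axis=0)  # statistics of this batch
    else:
        mean, var = moving_mean, moving_var        # accumulated statistics
    return (x - mean) / np.sqrt(var + eps)

x = np.array([[10.0], [12.0]])                     # a batch far from the stats
moving_mean, moving_var = np.array([0.0]), np.array([1.0])

train_out = batch_norm(x, moving_mean, moving_var, is_training=True)
eval_out = batch_norm(x, moving_mean, moving_var, is_training=False)
# train_out is centred around 0 regardless of the moving averages,
# while eval_out keeps the large offset -- so stale or mismatched
# moving averages make the two modes diverge badly.
```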
@sguada, after I updated my TF, CUDA, and cuDNN, the problem went away, so a suggestion: maybe document the required TF, CUDA, and cuDNN versions in slim. TF is now being developed by many people and is a bit complicated to debug.
My TensorFlow version is commit ed87884e50e1a50f7dc7b36dc7a7ff225442bee0, with CUDA 8.0 and cuDNN 5.1.
@sguada, in the slim implementation you apply a moving average to all the model parameters (maybe that is only needed for the inception models?), but in another implementation of ResNet under tensorflow/models, they apply it only to the batch norm parameters.
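For clarity, a sketch (NumPy, names hypothetical) of the first variant: keeping an exponential moving average (Polyak averaging) over all model parameters, a separate mechanism from the batch-norm statistics above. At eval time the shadow (averaged) weights are swapped in for the raw ones:

```python
import numpy as np

def ema_update(shadow, params, decay=0.999):
    """Polyak averaging: shadow <- decay * shadow + (1 - decay) * params."""
    return {k: decay * shadow[k] + (1 - decay) * params[k] for k in params}

params = {"conv_w": np.array([1.0]), "bn_beta": np.array([0.5])}
shadow = {k: v.copy() for k, v in params.items()}

# A gradient step changes a weight...
params["conv_w"] = np.array([2.0])
shadow = ema_update(shadow, params)
# ...but the shadow copy moves only slightly toward the new value,
# smoothing out noise from individual training steps.
```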
Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!
I still have the same issue, 2 years later
I have the same issue, 3 years later...
@sguada I think this problem may be related to batch_norm behaving differently during training and testing; there should be a better way to handle this.
I printed out all the values, but they are the same for 'True' and 'False'.
I have the same issue, 3 years later. I also printed all the values, and they are all the same. However, the results with True and False are extremely different: when is_training=False, no matter what the input is, the results are all 0 or 1.