Hi,
I've implemented a home-brewed ZFNet (prototxt) for my research. After 20k iterations with this definition, the test accuracy stays at ~0.001 (i.e., 1/1000), the test loss at ~6.9, and the training loss at ~6.9, which suggests the net is doing no better than random guessing among the 1k classes. I've thoroughly checked the whole definition and tried changing some of the hyper-parameters before restarting training, but to no avail: the same results show up on the screen.
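(For reference: with softmax cross-entropy loss, a uniform guess over 1000 classes scores -ln(1/1000) = ln(1000) ≈ 6.908, which matches the ~6.9 plateau on both train and test, so the net really does seem stuck at chance.)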
Could anyone shed some light on this? Thanks in advance!
The hyper-parameters in the prototxt are derived from the paper [1]. All the inputs and outputs of the layers seem correct, matching Fig. 3 in the paper.
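As a sanity check on those sizes (using the standard output-size formula floor((W - K + 2P) / S) + 1): conv1 in [1] has 7x7 kernels with stride 2, so a 225x225 crop gives (225 - 7)/2 + 1 = 110, i.e., the 110x110 feature maps of Fig. 3, while a 224 crop gives a non-integer 109.5; hence the crop tweak listed below.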
The tweaks are:
- crop sizes of the input for both training and testing are set to 225 instead of 224, as discussed in #33;
- conv3, conv4, and conv5 are tweaked to make the sizes of the blobs consistent with [1];
- the constant weight initialization of [1] is changed to gaussian with std: 0.01;
- weight_decay is changed from 0.0005 to 0.00025, as suggested by @sergeyk in PR #33 (see the sketch below).

[1] Zeiler, M. D. and Fergus, R. Visualizing and Understanding Convolutional Networks. ECCV 2014.
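For concreteness, here is a minimal sketch of how these tweaks look in prototxt; the layer names, paths, and batch size are placeholders rather than my actual definition, and it is written in the current `layer` syntax:

```
# Data layer: 225x225 crops instead of 224 (see #33).
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    crop_size: 225   # 225 instead of 224
    mirror: true
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"  # placeholder path
    batch_size: 128                                  # placeholder
    backend: LMDB
  }
}

# conv1 as in [1] (96 filters, 7x7, stride 2), but with gaussian
# initialization (std 0.01) instead of the paper's constant init.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
```

and in the solver prototxt:

```
weight_decay: 0.00025   # halved from 0.0005, per @sergeyk in PR #33
```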
Related issue: #32; related PR: #33.
P.S.: For the log showing these poor results, please refer to here.
From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:
_Please do not post usage, installation, or modeling questions, or other requests for help to Issues._
Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.