@davidsandberg
In the center_loss function there is one tensor named centers that has the dimension nrof_features, i.e. one center per feature.
I'm a little confused. In "A Discriminative Feature Learning Approach for Deep Face Recognition", Wen et al. say "In each iteration, the centers are computed by averaging the features of the corresponding classes". So in my opinion, there should be one center for every class, instead of one for every feature. And in "3.3 Discussion", when they visualize the distribution of features on the MNIST dataset, they use 10 centers, the average feature vectors of the 10 digits. I think there must be something wrong with my understanding. Would you please answer my question?
This is very strange to me too. I think that 'centers' should have dimension
The idea is that each class has its own center, and the center loss makes examples of the same class similar to each other.
With only one 'center', it looks like we want all classes to be similar to each other.
But maybe I just don't understand the TensorFlow code.
@melgor yes, you're absolutely right. It can be seen as each class having one center, and each center having the dimension nrof_features.
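To make the shape concrete, here is a minimal pure-Python sketch of "one center per class, each of dimension nrof_features", computed by averaging the features of each class as described in the quote from Wen et al. The helper name class_centers and the toy data are made up for illustration and are not part of the facenet code.

```python
from collections import defaultdict


def class_centers(features, labels):
    """Return {label: mean feature vector} for a batch of feature vectors."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        # Accumulate the per-dimension sum for this class.
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    # Divide each per-class sum by the number of samples in that class.
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


# Toy data (hypothetical): 4 samples, nrof_features = 2, nrof_classes = 2.
features = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [12.0, 2.0]]
labels = [0, 0, 1, 1]
centers = class_centers(features, labels)
```

The result has one entry per class, and every entry has nrof_features values, which corresponds to a [nrof_classes, nrof_features] tensor.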
I have started training using a new center loss based on some code found here.
Will see how it works, but it will probably require some parameter tuning.
The new center loss looks like this:
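    import tensorflow as tf

    def center_loss_new(logits, label, alfa, nrof_classes):
        # 'logits' holds the feature vectors, shape [batch_size, nrof_features]
        nrof_features = logits.get_shape()[1]
        # One center per class, each of dimension nrof_features
        centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32,
            initializer=tf.constant_initializer(0), trainable=False)
        label = tf.reshape(label, [-1])
        # Current center for the class of each sample in the batch
        logits_batch = tf.gather(centers, label)
        # Move each center toward the features of its class
        diff = (1 - alfa) * (logits_batch - logits)
        centers = tf.scatter_sub(centers, label, diff)
        centers_batch = tf.gather(centers, label)
        loss = tf.nn.l2_loss(logits - centers_batch)
        return loss, centers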
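For readers who want to follow the arithmetic of center_loss_new without TensorFlow, the update can be sketched in plain Python. This is only an illustration under assumptions: the helper names update_centers and center_loss_value and the toy numbers are hypothetical, all differences are computed against the centers as they were before the batch (as the gather before scatter_sub suggests), and contributions from repeated labels are summed.

```python
def update_centers(centers, features, labels, alfa):
    """Apply center -= (1 - alfa) * (center - feature) for every sample."""
    old = [row[:] for row in centers]  # snapshot used for all differences
    new = [row[:] for row in centers]
    for vec, lab in zip(features, labels):
        for j, v in enumerate(vec):
            new[lab][j] -= (1 - alfa) * (old[lab][j] - v)
    return new


def center_loss_value(centers, features, labels):
    """Half the summed squared distance of each sample to its class center."""
    total = 0.0
    for vec, lab in zip(features, labels):
        for j, v in enumerate(vec):
            total += (v - centers[lab][j]) ** 2
    return 0.5 * total


# Toy batch (hypothetical): centers start at zero, one sample per class.
centers = [[0.0, 0.0], [0.0, 0.0]]
features = [[2.0, 4.0], [4.0, 8.0]]
labels = [0, 1]
new_centers = update_centers(centers, features, labels, alfa=0.5)
loss = center_loss_value(new_centers, features, labels)
```

With alfa = 0.5 each center moves halfway toward the features of its class, so a larger alfa means slower-moving centers.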
Three more things about CenterLoss, based on the Caffe implementation:
xavier or any other random initialization.
nrof_features and number of classes). You used a much smaller value (1e-5), but you still get very good results (> 98%), which is great.
@davidsandberg Hello, have you started training with the new center loss? Would you please share your parameter values, such as learning_rate, decov_loss_factor and center_loss_factor, with me?
I did some training, but sadly it didn't improve the performance compared to using the old center loss. The parameter settings that gave the best performance were
weight_decay = 2e-4
center_loss = 2e-4
Will check the center loss implementation a bit more when I get some time to spare.
The new implementation looks ok and there is a unit test (https://github.com/davidsandberg/facenet/blob/master/test/center_loss_test.py) to verify it. The new implementation has been committed to master.