The "Dense" layer has parameters "kernel_initializer", "kernel_regularizer" and "kernel_constraint" which in my opinion are quite confusing names. I have not seen the weights in a dense layer being referred to as "kernel". The docs say "Initializer for the kernel weights matrix". I assume these parameters are named for consistency with the convolutional kernels and changing them might be tricky (though I'm not sure if consistency with the convolutional layers is really a good criterion here).
But I think the documentation could be a bit more clear. If the docs just said "weight matrix" instead of "kernel weight matrix" I feel that would be easier to understand.
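For concreteness, these are the parameters I mean (a minimal sketch against the tf.keras API; the argument values are just illustrative):

```python
from tensorflow import keras

# The three "kernel_*" parameters under discussion; values are illustrative.
layer = keras.layers.Dense(
    units=64,
    kernel_initializer="glorot_uniform",             # "Initializer for the kernel weights matrix"
    kernel_regularizer=keras.regularizers.l2(1e-4),  # penalty on the weight matrix only
    kernel_constraint=keras.constraints.MaxNorm(3.0),
)
```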
A layer's "weights" include both the features matrix and the bias vector.
We needed a precise way to distinguish between kernel and bias. A way that
would be shared across all layer types. We settled for "kernel/bias", which
is canonical in conv layers, and was sometimes used in dense layers even
before this decision was made.
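To illustrate the distinction (a minimal sketch, assuming a recent tf.keras; shapes are illustrative):

```python
import tensorflow as tf

layer = tf.keras.layers.Dense(4)
layer.build(input_shape=(None, 8))  # creates the layer's variables

# "weights" is the umbrella term: it covers both variables.
print([w.shape for w in layer.weights])  # [(8, 4), (4,)]

# "kernel" and "bias" name the two parts unambiguously.
print(layer.kernel.shape)  # (8, 4) -- the weight matrix
print(layer.bias.shape)    # (4,)
```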
Also note that a dense layer is a special case of a conv layer with a window that is the size of the input. It is thus as correct to refer to a "kernel" in one case as in the other.
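A quick numerical check of that equivalence (a sketch, assuming TF 2.x; a Conv1D whose kernel_size spans the whole input computes the same affine map as a Dense on the flattened input):

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(1, 8, 3).astype("float32")  # (batch, length, channels)

dense = tf.keras.layers.Dense(units=4)
conv = tf.keras.layers.Conv1D(filters=4, kernel_size=8, padding="valid")

flat = x.reshape(1, 8 * 3)
dense_out = dense(flat)  # shape (1, 4); also builds the Dense variables

# Copy the Dense kernel/bias into the Conv1D so both compute the same
# affine map; the conv kernel has shape (kernel_size, channels, filters).
conv.build(x.shape)
kernel, bias = dense.get_weights()  # kernel: (24, 4), bias: (4,)
conv.set_weights([kernel.reshape(8, 3, 4), bias])

conv_out = conv(x)  # shape (1, 1, 4): one window position, 4 filters
np.testing.assert_allclose(
    dense_out.numpy(), conv_out.numpy().reshape(1, 4), rtol=1e-4, atol=1e-5
)
```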
This notation is used throughout Keras and TensorFlow, and is thus recognized by a supermajority of the deep learning community. It's effectively canonical.
Thanks for your reply. If it's used in Keras and TensorFlow I guess it is now canonical (I have not seen it in a deep learning paper, though). Having done deep learning before either existed, it's really foreign to me. In the language I am used to, "weights" did not include the biases; I guess TensorFlow and Keras decided on a different nomenclature. "Kernel" is a very overloaded term in ML, and I feel this usage is not helping.