I want to predict an integer value in [0, +∞). For the activation functions I use relu or linear, but I didn't find a way to output only integers, so I came up with my own (very simple) implementation.
Would it be useful to include this code in Keras? That's an open question. We could also pass a function argument so that any element-wise function can be applied, in case Round is too restrictive. For example:
Custom.__init__(self, func_ptr, **kwargs)
Let me know.
from keras.layers import Layer
import keras.backend as K

class Round(Layer):
    # Rounds its input to the nearest integer (written against the old Keras Layer API).
    def __init__(self, **kwargs):
        super(Round, self).__init__(**kwargs)

    def get_output(self, train=False):
        X = self.get_input(train)
        return K.round(X)

    def get_config(self):
        config = {"name": self.__class__.__name__}
        base_config = super(Round, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
from keras.models import Sequential
from keras.layers import Dense, Activation

m = Sequential()
m.add(Dense(20, input_shape=(max_features,)))  # max_features: input dimensionality, defined elsewhere
m.add(Activation('relu'))
m.add(Dense(20))
m.add(Activation('relu'))
m.add(Dense(3))
m.add(Round())
m.compile(loss='mean_absolute_error', optimizer='adam')
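As an aside, here is a rough sketch of what the proposed Custom layer could look like, written against the newer Keras Layer API (call() instead of get_output()). The class name, the func argument, and the usage line are illustrative assumptions, not an existing Keras API.

from keras.layers import Layer
import keras.backend as K

class Custom(Layer):
    def __init__(self, func, **kwargs):
        # func: any element-wise backend function, e.g. K.round
        self.func = func
        super(Custom, self).__init__(**kwargs)

    def call(self, inputs):
        # Apply the supplied function element-wise
        return self.func(inputs)

    def compute_output_shape(self, input_shape):
        # Element-wise transform, so the shape is unchanged
        return input_shape

It would be used as, for example, m.add(Custom(K.round)) in place of m.add(Round()).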
The gradient of round() is zero almost everywhere, so the gradient of the loss w.r.t. the parameters is always zero, cf. https://github.com/fchollet/keras/issues/2218#issuecomment-206875151
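A minimal way to see this, assuming a TensorFlow 2.x backend (this check is not from the linked comment):

import tensorflow as tf

x = tf.Variable([0.3, 1.7, 2.2])
with tf.GradientTape() as tape:
    y = tf.round(x)                        # tf.round is registered as not differentiable
    loss = tf.reduce_mean(tf.abs(y - 2.0))
print(tape.gradient(loss, x))              # None: no gradient reaches x through round()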
Would it be useful to include this code in Keras?
No.
Does this network actually train properly? It seems that none of the layers before the Round() layer should ever have their weights updated. Do the Dense layers' weights ever change, for example?
@carlthome Surprisingly, the network could train with TensorFlow 0.6.0. It no longer works on the latest versions.
There is a way (you are both wrong or right at the same time, decide :) )
Can this be implemented? How?
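One common workaround, and only a guess at what the previous comment has in mind, is a straight-through estimator: round on the forward pass but let the gradient pass through as if the layer were the identity. A minimal sketch with a Lambda layer (the helper name is illustrative):

from keras.layers import Lambda
import keras.backend as K

def straight_through_round(x):
    # Forward pass: round(x). Backward pass: gradient of the identity,
    # because the stop_gradient term contributes no gradient.
    return x + K.stop_gradient(K.round(x) - x)

round_layer = Lambda(straight_through_round)
# e.g. m.add(round_layer) in place of m.add(Round())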
I'm having exactly the same problem. I'm currently testing Lambda layers to do my type transformations and then recover the float32 type for training.
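A minimal sketch of that kind of Lambda-based transformation, assuming the Keras backend functions K.round and K.cast (this is not the commenter's actual code):

from keras.layers import Lambda
import keras.backend as K

# Round, pass through an integer dtype, then come back to float32 for training.
to_int_and_back = Lambda(lambda x: K.cast(K.cast(K.round(x), 'int32'), 'float32'))

Note that this still runs into the zero-gradient problem above if it sits between the trainable layers and the loss.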
For people really needing to try this, here is a cost function:
import keras.backend as K

def mean_rounding_loss(ytrue, ypred):
    # Its derivative equals the rounded difference to the target integer.
    x = ytrue - ypred
    a = K.round(K.abs(x))
    return K.mean(a * (a - 1) / 4 + a * x, axis=-1)
It does have a derivative, and that derivative equals the difference to the target integer. It should do the same as the old-school perceptron learning rule where, if I remember correctly, the delta for backpropagation was rounded to the actual difference between the predicted and target integers.
I needed that to test something, because all other cost functions were failing, but it did not help me.
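For completeness, a minimal usage sketch, assuming the model m defined earlier in the thread (this line is not part of the original comment):

m.compile(loss=mean_rounding_loss, optimizer='adam')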
@Darthholi thank you! Where did you get this formula from?
@Darthholi May I also ask how you came up with this function? I just tested this loss function; the model trained without error, but it seems that the loss is negative when x is negative. Is there a fix for this issue? Thank you so much!
@Darthholi How can I modify your code to round a number within (0, 1) to 0 or 1?