For multi-output graph models, do we already support per-output sample and class weights?
Update:
I just saw that the class weights docstring says it supports a dict, but the sample weights docstring doesn't say that.
I'm pretty sure _both_ support dicts _and_ lists (would have to check the code to verify)...
@EderSantana and @fchollet I confirm that both support dicts.
I have ported my old Graph code to the functional API with multiple outputs for multi-label attribute recognition, and both class and sample weights work just as they did with the old Graph API.
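For reference, here is a minimal sketch of passing per-output dicts with the functional API; the output names, shapes, and random data below are made up for illustration, and the exact fit signature may vary across Keras versions:

import numpy as np
from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(8,))
out_a = Dense(1, activation='sigmoid', name='output_a')(inp)
out_b = Dense(3, activation='softmax', name='output_b')(inp)
model = Model(inputs=inp, outputs=[out_a, out_b])
model.compile(optimizer='sgd',
              loss={'output_a': 'binary_crossentropy',
                    'output_b': 'categorical_crossentropy'})

x = np.random.rand(32, 8)
y_a = np.random.randint(0, 2, (32, 1))
y_b = np.eye(3)[np.random.randint(0, 3, 32)]  # one-hot targets

# Both class_weight and sample_weight accept dicts keyed by output name.
model.fit(x, {'output_a': y_a, 'output_b': y_b},
          class_weight={'output_a': {0: 1., 1: 5.}},
          sample_weight={'output_b': np.ones(32)},
          epochs=1)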
However, I have asked a couple of times for support for 2D sample weights, rather than resorting to a 3D target vector. I have a strong use case for it: predicting multiple binary attributes while optimizing a joint loss function with sample weights is currently not possible without using multiple outputs.
Each binary outcome in multi-label binary learning is a class of its own, meaning each output has two classes; hence, you need to weight positive and negative samples differently per label. With the current Keras class and sample weight restrictions, the only way to achieve this is to have multiple outputs with one neuron each instead of the usual single output with multiple neurons. The other, less attractive, option is to convert the target vector to 3D, duplicating it for zero and one as separate targets. Both workarounds are too inconvenient.
Would it be difficult to support 2D weights per example in the 2D target case? Call them anything you want; they don't necessarily have to be called sample weights. They could be a third weight input to the training loop.
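One stock-Keras way to get a per-sample, per-label 2D weight matrix (related to, though not identical with, the 3D-target trick described above) is sample_weight_mode='temporal', treating each label as a "timestep". A sketch with made-up shapes:

import numpy as np
from keras.layers import Dense, Input, Reshape
from keras.models import Model

num_labels = 10

inp = Input(shape=(8,))
h = Dense(num_labels, activation='sigmoid')(inp)
# Make the targets 3D, (batch, num_labels, 1), so each label acts as a timestep.
out = Reshape((num_labels, 1))(h)
model = Model(inputs=inp, outputs=out)
model.compile(optimizer='sgd', loss='binary_crossentropy',
              sample_weight_mode='temporal')

x = np.random.rand(32, 8)
y = np.random.randint(0, 2, (32, num_labels, 1)).astype('float32')
w = np.ones((32, num_labels))  # one weight per sample *and* per label
model.fit(x, y, sample_weight=w, epochs=1)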
I am also looking forward to this improvement in Keras!
In fact, multi-label binary classification problems are common. Positive and negative samples are often unbalanced, and it is necessary to weight samples by both sample and label.
I have the same problem as @esube. The output consists of 10 binary labels, and I would like to change the individual contributions to the overall binary_crossentropy loss, which at the moment sums up the losses, as I understand it. Is there a way to change the weights of each binary task without creating 10 separate outputs and using loss_weights in compile?
A year later and I have the same problem as @esube and @DeepVoltaire. Are there any plans to implement such a method?
@DeepVoltaire I think you need loss_weights in your model compile arguments
@HarisIqbal88 loss_weights only works if you have multiple outputs, but the question was about having only one output with 10 neurons. By the way, I found a solution by writing my own custom loss function that depends on 2D weights: https://stackoverflow.com/questions/48485870/multi-label-classification-with-class-weights-in-keras/48700950#48700950 .
That way you can have one output with multiple neurons and assign each output neuron two weights, one for background and one for signal.
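The calculating_class_weights helper referenced in the next comment comes from that Stack Overflow answer; a sketch of it, assuming sklearn's compute_class_weight and binary labels in every column:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

def calculating_class_weights(y_true):
    # y_true: binary array of shape (num_obs, num_classes).
    # Returns shape (num_classes, 2): column 0 is the weight for class 0
    # (background), column 1 the weight for class 1 (signal), per label.
    number_dim = np.shape(y_true)[1]
    weights = np.empty([number_dim, 2])
    for i in range(number_dim):
        weights[i] = compute_class_weight(class_weight='balanced',
                                          classes=np.array([0., 1.]),
                                          y=y_true[:, i])
    return weights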
@dennis-ec Thanks for the functions. I understand what calculating_class_weights does, but I'm not sure I understand get_weighted_loss. If y_true is a list of true labels of length num_obs, and weights[:,0] holds the class-0 weights for all classes and has length num_classes, what does weights[:,0]**(1-y_true) give you?
from keras import backend as K

def get_weighted_loss(weights):
    # weights: shape (num_classes, 2); column 0 = class-0 (background) weight, column 1 = class-1 (signal) weight.
    def weighted_loss(y_true, y_pred):
        # One of the two power terms is always 1, so each entry's crossentropy is scaled by the matching weight.
        return K.mean((weights[:, 0] ** (1 - y_true)) * (weights[:, 1] ** y_true) * K.binary_crossentropy(y_true, y_pred), axis=-1)
    return weighted_loss
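Usage would then look something like this (model and data names are placeholders):

weights = calculating_class_weights(y_train)  # shape (num_classes, 2)
model.compile(optimizer='adam', loss=get_weighted_loss(weights))
model.fit(x_train, y_train, epochs=10)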
@Runze y_true has shape (num_obs, num_classes). The resulting object of weights[:,0]**(1-y_true) also has shape (num_obs, num_classes). Each entry is either 1 (when the y_true entry is signal, so the background weight is not needed) or the corresponding background weight. This is also the shape returned by K.binary_crossentropy, so the output of the loss function is multiplied element-wise by either the background weight or the signal weight (one of the two factors is always 1), independently for each class in each sample. K.mean() then takes the mean over the classes, and the result is a loss vector of shape (num_obs).
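A tiny NumPy illustration of the weighting term (made-up weights):

import numpy as np

weights = np.array([[1.0, 3.0],   # label 0: background weight 1.0, signal weight 3.0
                    [0.5, 2.0]])  # label 1: background weight 0.5, signal weight 2.0
y_true = np.array([[1., 0.]])     # one sample: label 0 is signal, label 1 is background

per_entry = (weights[:, 0] ** (1 - y_true)) * (weights[:, 1] ** y_true)
print(per_entry)  # [[3.  0.5]] -- signal weight for label 0, background weight for label 1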
@dennis-ec thank you. I understand now.