Keras: why is sample_weight restricted to 1D?

Created on 19 Apr 2016 · 6 comments · Source: keras-team/keras

I have tried to ask this twice with no response.

Could you please explain the reason why sample_weight is restricted to 1D when you have a 2D target?

In the multi-label case (where the labels are independent of each other), you want to weight each label of each sample differently. For instance, a single image may be annotated with several independent binary targets.

What is the best workaround without resorting to a graph model with a separate output for each label?


All 6 comments

Don't repost the same issue several times. It makes it harder for users to find good answers when searching. https://github.com/fchollet/keras/issues/2337

> Could you please explain the reason why sample_weight is restricted to 1D when you have a 2D target?

sample_weight is an array of weights with a 1:1 mapping to samples. One sample, one weight. It's simple enough.
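
For concreteness, a minimal runnable sketch of the plain per-sample case (the toy model, data, and the 3x weighting rule are all invented for illustration):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy binary classifier.
model = Sequential([Dense(1, activation='sigmoid', input_dim=10)])
model.compile(loss='binary_crossentropy', optimizer='adam')

x = np.random.rand(100, 10)
y = np.random.randint(0, 2, size=(100, 1))

# One weight per sample: here, positive samples count three times as much.
weights = np.where(y[:, 0] == 1, 3.0, 1.0)
model.fit(x, y, sample_weight=weights)
```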

If you are working with sequence data, you can also use timestep weighting (one weight per timestep per sample), via a 2D weight array. This is covered in the docs.
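
Likewise, a small sketch of timestep weighting (toy model and data; a weight of 0.0 could be used to mask padded steps):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense

# Toy sequence model producing one prediction per timestep.
model = Sequential([
    LSTM(8, return_sequences=True, input_shape=(6, 4)),
    TimeDistributed(Dense(1, activation='sigmoid')),
])
# 'temporal' tells Keras to expect a 2D (samples, timesteps) weight array.
model.compile(loss='binary_crossentropy', optimizer='adam',
              sample_weight_mode='temporal')

x = np.random.rand(32, 6, 4)
y = np.random.randint(0, 2, size=(32, 6, 1))
timestep_weights = np.random.rand(32, 6)
model.fit(x, y, sample_weight=timestep_weights)
```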

@carlthome Sorry for re-posting multiple times, but it seemed no one was responding.
@fchollet: I read the docs, so I understand what sample weights are.

However, my case is a multi-label learning task (each label is binary and hence NOT one-hot-encoded, e.g. 1 0 0 1 1) where some of the labels have few positive examples, and I would like to weight the losses of positive and negative samples of each label separately.

Thanks for your help!

Hi, glad to have another use case. I think this issue comes down to the inflexibility of the objective function, as I replied in #369. I came up with a temporary solution in my case: simply treat the label weights as another input alongside x, and define a scalar prediction that combines the weights with your original outputs. Then compile the model with:

```python
from keras import backend as K
model.compile(loss=lambda y_true, y_pred: K.mean(y_true * y_pred), optimizer='adam')
```

In the fitting step, I just pass an array of ones as the targets, which works fine for me.
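
For readers landing here, a fuller sketch of that workaround (the layer sizes, the hand-written cross-entropy, and all names are my own invention, not from the comment above): the labels and their weights go in as extra inputs, the model outputs its own per-sample loss, and fitting uses a dummy all-ones target.

```python
import numpy as np
from keras import backend as K
from keras.layers import Input, Dense, Lambda
from keras.models import Model

n_features, n_labels = 20, 5

x_in = Input(shape=(n_features,))   # features
y_in = Input(shape=(n_labels,))     # true binary labels, fed as an input
w_in = Input(shape=(n_labels,))     # one weight per sample per label

probs = Dense(n_labels, activation='sigmoid')(x_in)

def weighted_bce(args):
    # Elementwise binary cross-entropy, weighted, then summed per sample.
    y_true, y_pred, w = args
    y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())
    bce = -(y_true * K.log(y_pred) + (1.0 - y_true) * K.log(1.0 - y_pred))
    return K.sum(w * bce, axis=-1, keepdims=True)

loss_out = Lambda(weighted_bce, output_shape=(1,))([y_in, probs, w_in])

model = Model(inputs=[x_in, y_in, w_in], outputs=loss_out)
# The model's output already *is* the loss; multiplying by the all-ones
# dummy target leaves it unchanged, so the compiled loss just averages it.
model.compile(loss=lambda y_true, y_pred: K.mean(y_true * y_pred),
              optimizer='adam')

x = np.random.rand(128, n_features)
y = np.random.randint(0, 2, size=(128, n_labels)).astype('float32')

# Per-label positive/negative weights (numbers invented): positives of
# rare labels get weighted up, addressing the question asked above.
pos_w = np.array([5.0, 1.0, 1.0, 8.0, 2.0])
w = np.where(y == 1, pos_w, 1.0)

model.fit([x, y, w], np.ones((128, 1)))
```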

I can't tell why this feature request was closed, but I'd like to add another vote for it. It would be really, really useful to have a mode where the outputs, labels, and weights are all 2D matrices. That's the normal case for multitask networks: you predict many values for each sample, and each of those values gets weighted independently when computing the loss. This can be used to mask out missing labels, to correct for unbalanced data, or for various other purposes.

What about adding an option sample_weight_mode='multitask' for this case?

Note that outputs and labels can potentially have more than two dimensions. For example, in a multitask classifier they'd have shape (n_samples, n_tasks, n_classes). But the weights will always have shape (n_samples, n_tasks).
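
To make the proposal concrete, here is a plain NumPy sketch (all shapes and data invented) of the weighting being requested: an (n_samples, n_tasks) weight matrix applied elementwise to the per-task losses before reduction.

```python
import numpy as np

n_samples, n_tasks, n_classes = 4, 3, 5

labels = np.random.randint(n_classes, size=(n_samples, n_tasks))
y_true = np.eye(n_classes)[labels]  # shape (n_samples, n_tasks, n_classes)
y_pred = np.random.dirichlet(np.ones(n_classes), size=(n_samples, n_tasks))

weights = np.random.rand(n_samples, n_tasks)  # 0.0 would mask a missing label

# Cross-entropy per sample per task: shape (n_samples, n_tasks).
ce = -np.sum(y_true * np.log(y_pred + 1e-7), axis=-1)

# Each (sample, task) loss is weighted independently, then averaged.
loss = np.sum(weights * ce) / max(np.sum(weights), 1e-7)
print(loss)
```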
