I have a model with 2 categorical outputs.
The first output layer can predict 2 classes: [0, 1],
and the second output layer can predict 3 classes: [0, 1, 2].
Due to class imbalance I would like to use class weights for each output,
but whenever I add the class weights, the script fails with an error.
The script runs normally if the weights aren't added.
I've made a minimal example that reproduces the issue:
```python
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Input, Dense
from tensorflow.python.data import Dataset
import tensorflow as tf
import numpy as np


def preprocess_sample(features, labels):
    label1, label2 = labels
    label1 = tf.one_hot(label1, 2)
    label2 = tf.one_hot(label2, 3)
    return features, (label1, label2)


batch_size = 32
num_samples = 1000
num_features = 10

features = np.random.rand(num_samples, num_features)
labels1 = np.random.randint(2, size=num_samples)
labels2 = np.random.randint(3, size=num_samples)

train = Dataset.from_tensor_slices((features, (labels1, labels2))) \
    .map(preprocess_sample).batch(batch_size).repeat()

# Model
inputs = Input(shape=(num_features,))
output1 = Dense(2, activation='softmax', name='output1')(inputs)
output2 = Dense(3, activation='softmax', name='output2')(inputs)
model = Model(inputs, [output1, output2])
model.compile(loss='categorical_crossentropy', optimizer='adam')

class_weights = {'output1': {0: 1, 1: 10}, 'output2': {0: 5, 1: 1, 2: 10}}

model.fit(train, epochs=10, steps_per_epoch=num_samples // batch_size,
          # class_weight=class_weights
          )
```
This script runs successfully. But when you add the class weights by uncommenting the `# class_weight=class_weights` line, the script crashes with the following error:
```
Traceback (most recent call last):
  File "test.py", line 35, in <module>
    class_weight=class_weights
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1536, in fit
    validation_split=validation_split)
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 992, in _standardize_user_data
    class_weight, batch_size)
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1165, in _standardize_weights
    feed_sample_weight_modes)
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1164, in <listcomp>
    for (ref, sw, cw, mode) in zip(y, sample_weights, class_weights,
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 717, in standardize_weights
    y_classes = np.argmax(y, axis=1)
  File "venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 1004, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
  File "venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 62, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 42, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
numpy.core._internal.AxisError: axis 1 is out of bounds for array of dimension 1
```
I have the same problem on the basic MNIST digit dataset, except that I have only one output, obviously.
Last layer of the model:

```python
model.add(layers.Dense(10, activation='softmax'))
```

And my basic data-to-tensors method (there are a lot of non-essential options in the one I really use; these are the only relevant bits):
```python
import tensorflow as tf
from tensorflow.keras import utils  # for to_categorical


def data_to_tensors(dataset, categories=None):
    labels = dataset['label']
    data = dataset.drop(['label'], axis=1)

    def to_float(value):
        return value / 255

    data = data.apply(to_float).to_numpy()
    labels = utils.to_categorical(labels, 10)
    dataset = tf.data.Dataset.from_tensor_slices((data, labels))
    dataset = dataset.batch(32)
    dataset = dataset.repeat()
    return dataset
```
I have investigated the problem a bit. The main reason is that this code treats the tensor `y` as a plain array: you cannot simply call `np.argmax` on a tensor in this snippet from `tensorflow/python/keras/engine/training_utils.py`:
```python
elif isinstance(class_weight, dict):
    if len(y.shape) > 2:
        raise ValueError('`class_weight` not supported for '
                         '3+ dimensional targets.')
    if y.shape[1] > 1:
        y_classes = np.argmax(y, axis=1)
    elif y.shape[1] == 1:
        y_classes = np.reshape(y, y.shape[0])
    else:
        y_classes = y
```
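The AxisError itself is easy to reproduce in isolation; `np.argmax(..., axis=1)` requires an array with at least two dimensions (a minimal illustration of the NumPy error only, not of the full Keras code path):

```python
import numpy as np

y = np.asarray([0, 1, 0, 2])  # 1-D, unlike the 2-D one-hot array this code expects
np.argmax(y, axis=1)
# numpy.core._internal.AxisError: axis 1 is out of bounds for array of dimension 1
```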
Same!
Neither `sample_weight=` nor `class_weight=` seems to accept a list of dictionaries as input, even though `sample_weight_mode=` in `model.fit()` accepts a list so you can set different modes for multiple outputs. I thought maybe there would be issues with using sparse categorical targets, but I didn't even get that far =)
I had to write my own loss-function wrapper, calculate the weights separately, and pass them to the model.
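For what it's worth, a wrapper along those lines might look like the following (a minimal sketch, not the exact code from that comment; the weight values and the `output1`/`output2` names are borrowed from the example at the top of the thread):

```python
import tensorflow as tf
from tensorflow.keras import backend as K

def weighted_categorical_crossentropy(weights):
    """Categorical crossentropy scaled by the weight of each sample's true class.

    weights: sequence of length num_classes, e.g. [1.0, 10.0].
    """
    weights = K.constant(weights)

    def loss(y_true, y_pred):
        # y_true is one-hot, so this picks out the weight of the true class.
        sample_weight = tf.reduce_sum(y_true * weights, axis=-1)
        return sample_weight * tf.keras.losses.categorical_crossentropy(y_true, y_pred)

    return loss

# Hypothetical usage with the two-output model from the minimal example:
model.compile(optimizer='adam',
              loss={'output1': weighted_categorical_crossentropy([1.0, 10.0]),
                    'output2': weighted_categorical_crossentropy([5.0, 1.0, 10.0])})
```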
I am using a dictionary of dictionaries for `class_weight`.
@eyaler Is using a dictionary of dictionaries for `class_weight` in a multi-output tf.keras model actually working for you? If this works, could you please paste a snippet? Thanks.
@mmilosav It's working for me; however, I am using standalone Keras (2.2.4 or 2.2.5), not tf.keras:

```python
model.fit_generator(... class_weight={'name': {0: w1, 1: w2}})
```

where `'name'` is also used for the relevant output layer and its loss (not sure which is important) in a multi-output model.
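Filled out into a complete (if toy) example, and assuming the dict-of-dicts behavior described above holds in standalone Keras 2.2.x, that would look roughly like this:

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(10,))
out_a = Dense(2, activation='softmax', name='out_a')(inputs)
out_b = Dense(3, activation='softmax', name='out_b')(inputs)
model = Model(inputs, [out_a, out_b])
model.compile(optimizer='adam',
              loss={'out_a': 'categorical_crossentropy',
                    'out_b': 'categorical_crossentropy'})

def train_generator(batch_size=32):
    # Dummy data generator standing in for a real pipeline.
    while True:
        x = np.random.rand(batch_size, 10)
        y_a = np.eye(2)[np.random.randint(2, size=batch_size)]
        y_b = np.eye(3)[np.random.randint(3, size=batch_size)]
        yield x, [y_a, y_b]

# class_weight keyed by the output layer names, one dict per output:
model.fit_generator(train_generator(),
                    steps_per_epoch=10,
                    class_weight={'out_a': {0: 1, 1: 10},
                                  'out_b': {0: 5, 1: 1, 2: 10}})
```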
> @eyaler Is using a dictionary of dictionaries for `class_weight` in a multi-output tf.keras model actually working for you? If this works, could you please paste a snippet? Thanks.
@mmilosav Did you find a solution for this in tf.keras?
I am finding that it doesn't accept a dictionary in TF 2.2.
Thanks
A workaround for TF2 is to use sample weights via the sample_weight parameter when calling model.fit().
This seems to accept a list of weights for each output, so you can compute class weights and then use them to generate sample weights for each task. It is similar to passing a dict of class weights in Keras 2.x.
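For concreteness, a minimal sketch of that workaround applied to the two-output example at the top of the thread (this assumes plain NumPy inputs rather than a `tf.data.Dataset`, and reuses `features`, `labels1`, `labels2`, and `model` from that example):

```python
import numpy as np

class_weights1 = {0: 1.0, 1: 10.0}
class_weights2 = {0: 5.0, 1: 1.0, 2: 10.0}

# Translate class weights into one weight per sample, per output.
sample_weights1 = np.array([class_weights1[y] for y in labels1])
sample_weights2 = np.array([class_weights2[y] for y in labels2])

# One-hot targets, since the model uses categorical_crossentropy.
y1 = np.eye(2)[labels1]
y2 = np.eye(3)[labels2]

model.fit(features, [y1, y2],
          sample_weight=[sample_weights1, sample_weights2],
          batch_size=32, epochs=10)
```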
The implementation of this condition is faulty: the exception is raised for any sequence (e.g. dict) of outputs, even if the dict contains just one output.
Pretty embarrassing that this has been open for a year and a half now. Anyway, here's my solution for sparse categorical crossentropy for a Keras model with multiple outputs. I think it looks fairly clean but it might be horrifically inefficient, idk.
First, create a dictionary where the key is the name set in the output Dense layers and the value is a 1-D constant tensor. The value at index 0 of the tensor is the loss weight of class 0; a value is required for every class present in each output, even if it is just 1 or 0.
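For the two-output model from the first post, such a dictionary might look like this (the weight values are placeholders):

```python
import tensorflow as tf

class_weights = {
    'output1': tf.constant([1.0, 10.0]),       # one weight per class, classes 0-1
    'output2': tf.constant([5.0, 1.0, 10.0]),  # one weight per class, classes 0-2
}
```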
Compile your model with

```python
model.compile(optimizer=optimizer,
              loss={k: class_loss(v) for k, v in class_weights.items()})
```
where `class_loss()` is defined as follows:
```python
def class_loss(class_weight):
    """Returns a loss function for a specific class weight tensor.

    Params:
        class_weight: 1-D constant tensor of class weights

    Returns:
        A loss function where each loss is scaled according to the observed class
    """
    def loss(y_obs, y_pred):
        y_obs = tf.dtypes.cast(y_obs, tf.int32)
        hothot = tf.one_hot(tf.reshape(y_obs, [-1]), depth=class_weight.shape[0])
        weight = tf.math.multiply(class_weight, hothot)
        weight = tf.reduce_sum(weight, axis=-1)
        losses = tf.compat.v1.losses.sparse_softmax_cross_entropy(labels=y_obs,
                                                                  logits=y_pred,
                                                                  weights=weight)
        return losses
    return loss
```
If someone has a better suggestion than using `tf.compat.v1`, please let me know. I don't feel confident that it will stick around through future versions of TensorFlow.

EDIT: Be aware that this expects a linear (logits) output rather than a softmax output! You have to apply softmax to the outputs afterwards if you want softmax values (but if you just want the predictions ranked, logits still work).
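Not an official replacement, but one possible `tf.compat.v1`-free variant of the inner loss would be to weight the per-example crossentropy from `tf.nn` by hand (an untested sketch; note that `tf.compat.v1.losses` divides by the number of non-zero weights rather than by their sum, so the reductions differ slightly):

```python
import tensorflow as tf

def class_loss_v2(class_weight):
    """Same idea as class_loss above, without tf.compat.v1."""
    def loss(y_obs, y_pred):
        y_obs = tf.cast(tf.reshape(y_obs, [-1]), tf.int32)
        # Per-example weight = weight of the observed class.
        weight = tf.gather(class_weight, y_obs)
        ce = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_obs,
                                                            logits=y_pred)
        return tf.reduce_sum(weight * ce) / tf.reduce_sum(weight)
    return loss
```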
I am having the same problem, but I get a different error:

```
ValueError: Expected `class_weight` to be a dict with keys from 0 to one less than the number of classes, found {'output1': {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0}, 'output2': {0: 1.684420772303595, 1: 0.7110736368746486}}
```
This is my `model.fit()` code:

```python
class_weights = {'output1': {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0},
                 'output2': {0: 1.684420772303595, 1: 0.7110736368746486}}
model.fit(train_x, [train_age_y, train_gender_y], epochs=20, batch_size=32,
          validation_data=(test_x, [test_age_y, test_gender_y]),
          class_weight=class_weights, verbose=1)
```
I have defined the model like this:

```python
def define_model():
    img_input = Input(shape=(100, 100, 3))

    layer1 = Conv2D(32, (3, 3), padding='same', activation='relu')(img_input)
    layer1 = BatchNormalization()(layer1)
    layer1 = MaxPooling2D()(layer1)
    layer1 = Dropout(0.2)(layer1)

    layer2 = Conv2D(64, (3, 3), padding='same', activation='relu')(layer1)
    layer2 = BatchNormalization()(layer2)
    layer2 = MaxPooling2D()(layer2)
    layer2 = Dropout(0.2)(layer2)

    layer3 = Conv2D(128, (3, 3), padding='same', activation='relu')(layer2)
    layer3 = BatchNormalization()(layer3)
    layer3 = MaxPooling2D()(layer3)
    layer3 = Dropout(0.3)(layer3)

    layer4 = Conv2D(256, (3, 3), padding='same', activation='relu')(layer3)
    layer4 = BatchNormalization()(layer4)
    layer4 = MaxPooling2D()(layer4)
    layer4 = Dropout(0.3)(layer4)

    flatten_gender = Flatten()(layer4)

    layer5 = Conv2D(512, (3, 3), padding='same', activation='relu')(layer4)
    layer5 = BatchNormalization()(layer5)
    layer5 = MaxPooling2D()(layer5)
    layer5 = Dropout(0.3)(layer5)

    layer6 = Conv2D(1024, (3, 3), padding='same', activation='relu')(layer5)
    layer6 = BatchNormalization()(layer6)
    layer6 = MaxPooling2D()(layer6)
    layer6 = Dropout(0.3)(layer6)

    flatten_age = Flatten()(layer6)

    output1 = Dense(8, name='output1', activation='softmax')(flatten_age)
    output2 = Dense(1, name='output2', activation='sigmoid')(flatten_gender)

    model = Model(inputs=img_input, outputs=[output1, output2])
    model.compile(loss=['sparse_categorical_crossentropy', 'binary_crossentropy'],
                  optimizer='adam')
    print(model.summary())
    return model


model = define_model()
```
Can someone help me?
@thisisdhruvagarwal Answering my own question: I used TensorFlow 2.2.0 previously; I just uninstalled it and installed TensorFlow 1.15.0, and the code worked like a charm!!
When I install tensorflow 2.1.0 it still works.
I experience the same issue with TensorFlow 2.2.0 and 2.3.0 with a (non-sequential) Keras model. I compile the model like this:

```python
model = keras.models.Model(
    inputs=[inp1, inp2, inp3],
    outputs=[output]
)
```

where `output` is:

```python
output = keras.layers.Dense(1, activation="sigmoid", name="y")(x)
```

It works perfectly with tf 2.0.1.
I use the official Python 3 Docker containers in all cases.
@thisisdhruvagarwal See, I'm getting the multi-output error as well, but I don't have multiple outputs; I do have multi-class classification, though.

```python
model = Model(inputs=[sequence_input_head, sequence_input_body, semantic_feat, wordOL_feat],
              outputs=[preds])
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()

history = model.fit({'headline': hl_pd_tr, 'articleBody': bd_pd_train,
                     'semantic': semantic_x_tr, 'wordOverlap': wrd_OvLp_x_tr},
                    {'predic': y_train_cat},
                    epochs=100,
                    batch_size=BATCH__SIZE,
                    shuffle=True,
                    class_weight=class_weight_dict,
                    validation_data=([hl_pd_val, bd_pd_val, semantic_x_val, wrd_OvLp_x_val],
                                     y_val_cat),
                    callbacks=[es])
```