I have a model with 2 categorical outputs.
The first output layer can predict 2 classes: [0, 1],
and the second output layer can predict 3 classes: [0, 1, 2].
Due to class imbalance I would like to use class weights for each output,
but whenever I add the class weights, the script fails with an error.
The script runs normally if the weights aren't added.
I've made a minimal example that reproduces the issue:
```python
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Input, Dense
from tensorflow.python.data import Dataset
import tensorflow as tf
import numpy as np


def preprocess_sample(features, labels):
    label1, label2 = labels
    label1 = tf.one_hot(label1, 2)
    label2 = tf.one_hot(label2, 3)
    return features, (label1, label2)


batch_size = 32
num_samples = 1000
num_features = 10

features = np.random.rand(num_samples, num_features)
labels1 = np.random.randint(2, size=num_samples)
labels2 = np.random.randint(3, size=num_samples)

train = Dataset.from_tensor_slices((features, (labels1, labels2))) \
    .map(preprocess_sample).batch(batch_size).repeat()

# Model
inputs = Input(shape=(num_features,))
output1 = Dense(2, activation='softmax', name='output1')(inputs)
output2 = Dense(3, activation='softmax', name='output2')(inputs)
model = Model(inputs, [output1, output2])
model.compile(loss='categorical_crossentropy', optimizer='adam')

class_weights = {'output1': {0: 1, 1: 10}, 'output2': {0: 5, 1: 1, 2: 10}}

model.fit(train, epochs=10, steps_per_epoch=num_samples // batch_size,
          # class_weight=class_weights
          )
```
This script runs successfully. But when you add the class weights by uncommenting the `# class_weight=class_weights` line, the script crashes with the following error:
```
Traceback (most recent call last):
  File "test.py", line 35, in <module>
    class_weight=class_weights
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1536, in fit
    validation_split=validation_split)
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 992, in _standardize_user_data
    class_weight, batch_size)
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1165, in _standardize_weights
    feed_sample_weight_modes)
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1164, in <listcomp>
    for (ref, sw, cw, mode) in zip(y, sample_weights, class_weights,
  File "venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_utils.py", line 717, in standardize_weights
    y_classes = np.argmax(y, axis=1)
  File "venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 1004, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
  File "venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 62, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 42, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
numpy.core._internal.AxisError: axis 1 is out of bounds for array of dimension 1
```
I have the same problem on the basic MNIST digit dataset, except that I have only one output, obviously.
Last layer of the model:

```python
model.add(layers.Dense(10, activation='softmax'))
```

And my basic data-to-tensors method (there are a lot of non-essential options in the one I really use; these are the only relevant bits):
```python
import tensorflow as tf
from tensorflow.keras import utils  # for to_categorical


def data_to_tensors(dataset, categories=None):
    labels = dataset['label']
    data = dataset.drop(['label'], axis=1)

    def to_float(value):
        return value / 255

    data = data.apply(to_float).to_numpy()
    labels = utils.to_categorical(labels, 10)
    dataset = tf.data.Dataset.from_tensor_slices((data, labels))
    dataset = dataset.batch(32)
    dataset = dataset.repeat()
    return dataset
```
I have investigated the problem a bit. The main reason is that this code treats the tensor `y` as a plain array: you cannot simply call `np.argmax` on a tensor in this snippet from `tensorflow/python/keras/engine/training_utils.py`:
```python
elif isinstance(class_weight, dict):
    if len(y.shape) > 2:
        raise ValueError('`class_weight` not supported for '
                         '3+ dimensional targets.')
    if y.shape[1] > 1:
        y_classes = np.argmax(y, axis=1)
    elif y.shape[1] == 1:
        y_classes = np.reshape(y, y.shape[0])
    else:
        y_classes = y
```
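The AxisError itself is easy to reproduce in isolation; `np.argmax(..., axis=1)` requires an array with at least two dimensions (a minimal illustration of the NumPy error only, not of the full Keras code path):

```python
import numpy as np

y = np.asarray([0, 1, 0, 2])  # 1-D, unlike the 2-D one-hot array this code expects
np.argmax(y, axis=1)
# numpy.core._internal.AxisError: axis 1 is out of bounds for array of dimension 1
```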
Same!
Neither `sample_weight=` nor `class_weight=` seems to accept a list of dictionaries as input, even though `sample_weight_mode=` in `model.fit()` accepts a list so you can set different modes for multiple outputs. I thought maybe there would be issues with using sparse categorical targets, but I didn't even get that far =)
I had to write my own loss-function wrapper, calculate the weights separately, and pass them to the model.
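For what it's worth, a wrapper along those lines might look like the following (a minimal sketch, not the exact code from that comment; the weight values and the `output1`/`output2` names are borrowed from the example at the top of the thread):

```python
import tensorflow as tf
from tensorflow.keras import backend as K

def weighted_categorical_crossentropy(weights):
    """Categorical crossentropy scaled by the weight of each sample's true class.

    weights: sequence of length num_classes, e.g. [1.0, 10.0].
    """
    weights = K.constant(weights)

    def loss(y_true, y_pred):
        # y_true is one-hot, so this picks out the weight of the true class.
        sample_weight = tf.reduce_sum(y_true * weights, axis=-1)
        return sample_weight * tf.keras.losses.categorical_crossentropy(y_true, y_pred)

    return loss

# Hypothetical usage with the two-output model from the minimal example:
model.compile(optimizer='adam',
              loss={'output1': weighted_categorical_crossentropy([1.0, 10.0]),
                    'output2': weighted_categorical_crossentropy([5.0, 1.0, 10.0])})
```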
I am using a dictionary of dictionaries for `class_weight`.
@eyaler Is using a dictionary of dictionaries for `class_weight` in a multi-output tf.keras model actually working for you? If this works, could you please paste a snippet? Thanks.
@mmilosav It's working for me; however, I am using standalone Keras (2.2.4 or 2.2.5), not tf.keras:

```python
model.fit_generator(... class_weight={'name': {0: w1, 1: w2}})
```

where `'name'` is also used for the relevant output layer and its loss (not sure which is important) in a multi-output model.
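Filled out into a complete (if toy) example, and assuming the dict-of-dicts behavior described above holds in standalone Keras 2.2.x, that would look roughly like this:

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(10,))
out_a = Dense(2, activation='softmax', name='out_a')(inputs)
out_b = Dense(3, activation='softmax', name='out_b')(inputs)
model = Model(inputs, [out_a, out_b])
model.compile(optimizer='adam',
              loss={'out_a': 'categorical_crossentropy',
                    'out_b': 'categorical_crossentropy'})

def train_generator(batch_size=32):
    # Dummy data generator standing in for a real pipeline.
    while True:
        x = np.random.rand(batch_size, 10)
        y_a = np.eye(2)[np.random.randint(2, size=batch_size)]
        y_b = np.eye(3)[np.random.randint(3, size=batch_size)]
        yield x, [y_a, y_b]

# class_weight keyed by the output layer names, one dict per output:
model.fit_generator(train_generator(),
                    steps_per_epoch=10,
                    class_weight={'out_a': {0: 1, 1: 10},
                                  'out_b': {0: 5, 1: 1, 2: 10}})
```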
> @eyaler Is using a dictionary of dictionaries for `class_weight` in a multi-output tf.keras model actually working for you? If this works, could you please paste a snippet? Thanks.
@mmilosav Did you find a solution for this in tf.keras?
I am finding that it doesn't accept a dictionary in TF 2.2.
Thanks
A workaround for TF2 is to use sample weights via the sample_weight parameter when calling model.fit().
This seems to accept a list of weights for each output, so you can compute class weights and then use them to generate sample weights for each task. It is similar to passing a dict of class weights in Keras 2.x.
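For concreteness, a minimal sketch of that workaround applied to the two-output example at the top of the thread (this assumes plain NumPy inputs rather than a `tf.data.Dataset`, and reuses `features`, `labels1`, `labels2`, and `model` from that example):

```python
import numpy as np

class_weights1 = {0: 1.0, 1: 10.0}
class_weights2 = {0: 5.0, 1: 1.0, 2: 10.0}

# Translate class weights into one weight per sample, per output.
sample_weights1 = np.array([class_weights1[y] for y in labels1])
sample_weights2 = np.array([class_weights2[y] for y in labels2])

# One-hot targets, since the model uses categorical_crossentropy.
y1 = np.eye(2)[labels1]
y2 = np.eye(3)[labels2]

model.fit(features, [y1, y2],
          sample_weight=[sample_weights1, sample_weights2],
          batch_size=32, epochs=10)
```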
The implementation of this condition is faulty: the exception is raised for any sequence (e.g. dict) of outputs, even if the dict contains just one output.
Pretty embarrassing that this has been open for a year and a half now. Anyway, here's my solution for sparse categorical crossentropy for a Keras model with multiple outputs. I think it looks fairly clean but it might be horrifically inefficient, idk.
First, create a dictionary where the key is the name set in the output Dense layers and the value is a 1-D constant tensor. The value at index 0 of the tensor is the loss weight of class 0; a value is required for every class present in each output, even if it is just 1 or 0.
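For the two-output model from the first post, such a dictionary might look like this (the weight values are placeholders):

```python
import tensorflow as tf

class_weights = {
    'output1': tf.constant([1.0, 10.0]),       # one weight per class, classes 0-1
    'output2': tf.constant([5.0, 1.0, 10.0]),  # one weight per class, classes 0-2
}
```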
Compile your model with

```python
model.compile(optimizer=optimizer,
              loss={k: class_loss(v) for k, v in class_weights.items()})
```
where `class_loss()` is defined as follows:
```python
def class_loss(class_weight):
    """Returns a loss function for a specific class weight tensor.

    Params:
        class_weight: 1-D constant tensor of class weights

    Returns:
        A loss function where each loss is scaled according to the observed class
    """
    def loss(y_obs, y_pred):
        y_obs = tf.dtypes.cast(y_obs, tf.int32)
        hothot = tf.one_hot(tf.reshape(y_obs, [-1]), depth=class_weight.shape[0])
        weight = tf.math.multiply(class_weight, hothot)
        weight = tf.reduce_sum(weight, axis=-1)
        losses = tf.compat.v1.losses.sparse_softmax_cross_entropy(labels=y_obs,
                                                                  logits=y_pred,
                                                                  weights=weight)
        return losses
    return loss
```
If someone has a better suggestion than using `tf.compat.v1`, please let me know. I don't feel confident that it will stick around through future versions of TensorFlow.

EDIT: Be aware that this expects a linear (logits) output rather than a softmax output! You have to apply softmax to the outputs afterwards if you want softmax values (but if you just want the predictions ranked, logits still work).
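Not an official replacement, but one possible `tf.compat.v1`-free variant of the inner loss would be to weight the per-example crossentropy from `tf.nn` by hand (an untested sketch; note that `tf.compat.v1.losses` divides by the number of non-zero weights rather than by their sum, so the reductions differ slightly):

```python
import tensorflow as tf

def class_loss_v2(class_weight):
    """Same idea as class_loss above, without tf.compat.v1."""
    def loss(y_obs, y_pred):
        y_obs = tf.cast(tf.reshape(y_obs, [-1]), tf.int32)
        # Per-example weight = weight of the observed class.
        weight = tf.gather(class_weight, y_obs)
        ce = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_obs,
                                                            logits=y_pred)
        return tf.reduce_sum(weight * ce) / tf.reduce_sum(weight)
    return loss
```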
I am having the same problem, but I get a different error:

```
ValueError: Expected `class_weight` to be a dict with keys from 0 to one less than the number of classes, found {'output1': {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0}, 'output2': {0: 1.684420772303595, 1: 0.7110736368746486}}
```
This is my `model.fit()` code:

```python
class_weights = {'output1': {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0},
                 'output2': {0: 1.684420772303595, 1: 0.7110736368746486}}
model.fit(train_x, [train_age_y, train_gender_y], epochs=20, batch_size=32,
          validation_data=(test_x, [test_age_y, test_gender_y]),
          class_weight=class_weights, verbose=1)
```
I have defined the model like this:

```python
def define_model():
    img_input = Input(shape=(100, 100, 3))

    layer1 = Conv2D(32, (3, 3), padding='same', activation='relu')(img_input)
    layer1 = BatchNormalization()(layer1)
    layer1 = MaxPooling2D()(layer1)
    layer1 = Dropout(0.2)(layer1)

    layer2 = Conv2D(64, (3, 3), padding='same', activation='relu')(layer1)
    layer2 = BatchNormalization()(layer2)
    layer2 = MaxPooling2D()(layer2)
    layer2 = Dropout(0.2)(layer2)

    layer3 = Conv2D(128, (3, 3), padding='same', activation='relu')(layer2)
    layer3 = BatchNormalization()(layer3)
    layer3 = MaxPooling2D()(layer3)
    layer3 = Dropout(0.3)(layer3)

    layer4 = Conv2D(256, (3, 3), padding='same', activation='relu')(layer3)
    layer4 = BatchNormalization()(layer4)
    layer4 = MaxPooling2D()(layer4)
    layer4 = Dropout(0.3)(layer4)

    flatten_gender = Flatten()(layer4)

    layer5 = Conv2D(512, (3, 3), padding='same', activation='relu')(layer4)
    layer5 = BatchNormalization()(layer5)
    layer5 = MaxPooling2D()(layer5)
    layer5 = Dropout(0.3)(layer5)

    layer6 = Conv2D(1024, (3, 3), padding='same', activation='relu')(layer5)
    layer6 = BatchNormalization()(layer6)
    layer6 = MaxPooling2D()(layer6)
    layer6 = Dropout(0.3)(layer6)

    flatten_age = Flatten()(layer6)

    output1 = Dense(8, name='output1', activation='softmax')(flatten_age)
    output2 = Dense(1, name='output2', activation='sigmoid')(flatten_gender)

    model = Model(inputs=img_input, outputs=[output1, output2])
    model.compile(loss=['sparse_categorical_crossentropy', 'binary_crossentropy'],
                  optimizer='adam')
    print(model.summary())
    return model


model = define_model()
```
Can someone help me?
@thisisdhruvagarwal Answering my own question: I used TensorFlow 2.2.0 previously; I just uninstalled it and installed TensorFlow 1.15.0, and the code worked like a charm!!
When I install tensorflow 2.1.0 it still works.
I experience the same issue with TensorFlow 2.2.0 and 2.3.0 with a (non-sequential) Keras model. I compile the model like this:

```python
model = keras.models.Model(
    inputs=[inp1, inp2, inp3],
    outputs=[output]
)
```

where `output` is:

```python
output = keras.layers.Dense(1, activation="sigmoid", name="y")(x)
```

It works perfectly with tf 2.0.1.
I use the official Python 3 Docker containers in all cases.
@thisisdhruvagarwal See, I'm getting the multi-output error as well, but I don't have multiple outputs; I do have multi-class classification, though.

```python
model = Model(inputs=[sequence_input_head, sequence_input_body, semantic_feat, wordOL_feat],
              outputs=[preds])
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.summary()

history = model.fit({'headline': hl_pd_tr, 'articleBody': bd_pd_train,
                     'semantic': semantic_x_tr, 'wordOverlap': wrd_OvLp_x_tr},
                    {'predic': y_train_cat},
                    epochs=100,
                    batch_size=BATCH__SIZE,
                    shuffle=True,
                    class_weight=class_weight_dict,
                    validation_data=([hl_pd_val, bd_pd_val, semantic_x_val, wrd_OvLp_x_val],
                                     y_val_cat),
                    callbacks=[es])
```