Keras: sparse_categorical_crossentropy requires a single output

Created on 26 Aug 2017 · 2 Comments · Source: keras-team/keras

I'm working on a classification problem where the data is imbalanced (some classes occur very rarely). When I train my model with loss=categorical_crossentropy, it works, but the model just outputs the most frequent class most of the time (which makes sense given my loss function). So now I want to try loss=sparse_categorical_crossentropy, because that sounds like something that would address this problem. But switching to the sparse loss leads to an error about output dims.

Maybe I'm misunderstanding sparse_categorical_crossentropy and it's not a drop-in replacement for categorical_crossentropy?

Here's a minimal example:

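import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation

# Two-class softmax over inputs of shape (batch, timesteps, 2)
model = Sequential()
model.add(Dense(2, input_shape=(None, 2)))
model.add(Activation('softmax'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

inp = np.array([[[2, 3]]])  # shape (1, 1, 2)
out = np.array([[[1, 0]]])  # one-hot target, shape (1, 1, 2)

model.fit(inp, out)  # raises the ValueError below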

>> ValueError: Error when checking target: expected activation_1 to have shape (None, None, 1) but got array with shape (1, 1, 2)

Note that in the above example: model.predict(inp).shape == out.shape. Therefore, it sounds like my shapes are correct but it still throws during fit.

Most helpful comment

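Hi,
I think there's some confusion about what sparse categorical cross-entropy does. "Sparse" refers to how the labels are encoded, not to how often each class occurs, so it won't help with rare classes.
Sparse categorical cross-entropy expects integer class indices: labels of shape [batch_size] (in the Keras version from the question, with a trailing axis of size 1, which is why the error asks for shape (None, None, 1)).
Categorical cross-entropy expects one-hot labels of shape [batch_size, num_class_labels].
For the example above (binary classification, one sample, one timestep), the targets should look like:
out = np.array([[[0]]])  # sparse categorical cross-entropy: class index 0
out = np.array([[[1, 0]]])  # categorical cross-entropy: one-hot for class 0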
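
To make the difference in label format concrete, here is a minimal NumPy-only sketch (not Keras code). It converts the one-hot target from the question into integer class indices and checks that both losses give the same value for the same prediction. The `probs` array is a made-up softmax output used purely for illustration, and `np.take_along_axis` assumes NumPy 1.15 or newer.

```python
import numpy as np

# One-hot target from the question: shape (batch, timesteps, num_classes)
one_hot = np.array([[[1, 0]]])

# Integer class indices with a trailing axis of size 1: shape (batch, timesteps, 1)
sparse = np.argmax(one_hot, axis=-1)[..., np.newaxis]

# Hypothetical softmax output for the same sample (illustrative values only)
probs = np.array([[[0.7, 0.3]]])

# Categorical cross-entropy: -sum(y * log(p)) over the class axis
cce = -np.sum(one_hot * np.log(probs), axis=-1)

# Sparse categorical cross-entropy: -log(p[class_index])
scce = -np.log(np.take_along_axis(probs, sparse, axis=-1))[..., 0]

print(sparse.shape)
print(np.allclose(cce, scce))
```

Since the two losses compute the same quantity from differently encoded labels, switching between them should not change how the model treats rare classes.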
