The following script fails with Keras 1.0.4, but worked with 1.0.3:
```python
from keras.layers import Dense, Activation
from keras.models import Sequential

model = Sequential([
    Dense(32, input_dim=2),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd')
model.fit([[0,1], [1,1], [1,0]], [1,2,3])
```
gives the following exception:
Exception: Error when checking model target: expected activation_6 to have shape (None, 10) but got array with shape (3, 1)
The check doesn't seem to take into account the sparse categorical crossentropy loss (which should take only one integer target per training example). This has been tested with both the TensorFlow and Theano backends.
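For concreteness, here is a minimal sketch of the two target formats involved (integer class indices for the sparse loss, one-hot rows for the plain categorical loss):

```python
import numpy as np

# sparse_categorical_crossentropy: one integer class index per example,
# e.g. the [1, 2, 3] targets from the script above.
y_sparse = np.array([1, 2, 3])

# categorical_crossentropy: one-hot rows of shape (3, 10) instead.
y_onehot = np.zeros((3, 10))
y_onehot[np.arange(3), y_sparse] = 1
```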
Your code runs for me. Have you tried syncing to the master branch?
Are you by any chance using Windows? I experience the same problem when I use it on my Windows 10 machine, even with the most recent master branch. However, on my Linux machine your code runs just fine. Anyone have any suggestions?
Most likely that's an installation problem on your Windows machine. There's no reason this code wouldn't run if you are actually running the latest master branch.
This was tested on Ubuntu with a fresh venv install and `pip install keras==1.0.4`.
However, I just ran with master and this same code runs fine with it. Reverting back to the PyPI 1.0.4 release still throws the above exception, though.
I'll close this issue for the time being. Many thanks!
Hi,
I've synced to the latest master branch (Keras 1.0.5) and get exactly the same exception.
I'm running the code in a conda environment with the Tensorflow backend (on Mac OS X).
I'm running the following (basically the same as above):
```python
import numpy as np
from keras.layers import Dense, Activation
from keras.models import Sequential

X_train = np.array([[1,2], [6,5], [8,2]])
y_train = np.array([2,3,7])
input_dim = X_train.shape[1]

model = Sequential()
model.add(Dense(output_dim=64, input_dim=input_dim))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X_train, y_train, nb_epoch=5, batch_size=32)
```
The exception I am getting is:
Exception: Error when checking model target: expected activation_2 to have shape (None, 10) but got array with shape (3, 1)
I'm synced with the master branch, but have also tried this with Keras 1.0.3; neither works. Do you have any idea what this issue could be stemming from?
Cheers
What's up @EmilienDupont, try `sparse_categorical_crossentropy`
and reshaping: `y_train = y_train.reshape((-1, 1))`
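Applied to the snippet above, the suggested fix would look roughly like this (a sketch; older Keras 1.x releases expected sparse targets as a column of shape `(n, 1)`):

```python
y_train = y_train.reshape((-1, 1))  # (3,) -> (3, 1): one integer label per row
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X_train, y_train, nb_epoch=5, batch_size=32)
```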
Cheers @lukedeo, that did the trick 👍
I'm new to Python and Keras, but am running into something similar with my code – what does a shape of `(None, 10)` actually look like? Would that not just be a basic list with 10 elements?
@EmilienDupont could you explain what you did? I also have the same problem.
@damianhinch
In the model above, the output at the end is 10-dimensional (`Dense(output_dim=10)`). Because of the softmax layer this can be interpreted as a probability distribution over the 10 classes I am trying to predict (e.g. 10 digits if using MNIST). A typical output would then look like `[0.1, 0.05, 0.3, ..., 0.02]`.

The problem then is that I am trying to fit this output, let's call it `y_pred`, with `y_train`. But each example in `y_train` is just a digit (e.g. 2), so it has shape `(1,)` whereas my `y_pred` has shape `(10,)`, so there is a mismatch.

There are two ways you can solve this. Either you can encode `y_train` as a one-hot vector, i.e. a vector which is 1 at the digit represented and 0 otherwise; so for 2 this would be `[0, 0, 1, 0, 0, ..., 0]`. You can use this one-hot encoding to fit the model with `categorical_crossentropy` as above.

Alternatively you can use `sparse_categorical_crossentropy`, which will take care of this transformation for you internally. The word sparse is used here because 2 is a sparse representation of `[0, 0, 1, 0, ..., 0]`, in the sense that it refers to the index of the non-zero element.
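As a concrete sketch of the one-hot route (using `to_categorical` from Keras's `np_utils`, assuming labels in the range 0-9):

```python
import numpy as np
from keras.utils.np_utils import to_categorical

y_train = np.array([2, 3, 7])           # integer class labels
y_onehot = to_categorical(y_train, 10)  # shape (3, 10), a single 1 per row
# y_onehot now fits a model compiled with categorical_crossentropy;
# alternatively, keep the integers and use sparse_categorical_crossentropy.
```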
@Emerson `(None, 10)` is just a placeholder for an array with an unknown number of rows and 10 columns. During training this would typically take a shape of `(<batch_size>, 10)`, depending on the size of your batch.
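You can see this directly on a built model (a quick check, assuming the model and `X_train` from earlier in the thread):

```python
print(model.output_shape)            # (None, 10): batch size not fixed yet
print(model.predict(X_train).shape)  # (3, 10): None resolved to the 3 examples
```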
Hi, I got the same error below:

ValueError: Error when checking target: expected sequential_1 to have 4 dimensions, but got array with shape (1481, 3)

The input shapes are X_train=(1481, 64, 64, 3) and y_train=(1481, 3), where `y_train` is a one-hot encoded array like `[[0 1 0], [1 0 0], ...]`, and the model definition is below:

```python
from keras.applications.vgg16 import VGG16
from keras.layers import Input, Dense
from keras.models import Sequential, Model

image_size = (64, 64)
input_image = Input(shape=(*image_size, 3))
base_model = VGG16(input_tensor=input_image, include_top=False)
top_model = Sequential()
top_model.add(Dense(3, input_shape=base_model.output_shape[1:], activation="softmax"))
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))
```

I understand that the error means the model's output shape does not match the shape of `y_train`, but I do not know how to solve this. Would you give me some advice if possible?
I had this same error before! I found out that the last layer is treated as the output layer, so make sure to change the last layer to `Dense(OUTPUT_SIZE)`.
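For the VGG16 question above, the "4 dimensions" complaint comes from `Dense` being applied directly to the 4D convolutional output. One likely fix (a sketch, assuming a 3-class classifier on top of the conv features) is to flatten before the final layer:

```python
from keras.applications.vgg16 import VGG16
from keras.layers import Input, Flatten, Dense
from keras.models import Model

input_image = Input(shape=(64, 64, 3))
base_model = VGG16(input_tensor=input_image, include_top=False)

# Flatten the 4D conv features to 2D so the output becomes (None, 3),
# matching the (1481, 3) one-hot targets.
x = Flatten()(base_model.output)
predictions = Dense(3, activation="softmax")(x)
model = Model(inputs=base_model.input, outputs=predictions)
```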
Hi, I use a CNN for text classification and I got this error, which is similar to the previous errors, but I couldn't solve it with the previous solutions. Here is the error:

ValueError: Error when checking input: expected input_11 to have shape (None, 185) but got array with shape (1665, 35)

and here is the printed information:

x_train shape: (1665, 35)
x_test shape: (185, 35)
Vocabulary Size: 4825
y_train 1665
y_test 185
Initializing embedding layer with word2vec weights, shape (4825, 300)

I would be grateful if you could help me figure it out.
@simarad Could you please share your code so that I could look into the issue?
Yes, I used the code in https://github.com/alexander-rakhlin/CNN-for-Sentence-Classification-in-Keras, and my main code is:
```python
"""
Train convolutional network for sentiment analysis on IMDB corpus. Based on
"Convolutional Neural Networks for Sentence Classification" by Yoon Kim
http://arxiv.org/pdf/1408.5882v2.pdf

For "CNN-rand" and "CNN-non-static" gets to 88-90%, and "CNN-static" - 85% after 2-5 epochs with following settings:
embedding_dim = 50
filter_sizes = (3, 8)
num_filters = 10
dropout_prob = (0.5, 0.8)
hidden_dims = 50
"""
import numpy as np
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Flatten, Input, MaxPooling1D, Convolution1D, Embedding
from keras.layers.merge import Concatenate
from keras.preprocessing import sequence

np.random.seed(0)

embedding_dim = 300
filter_sizes = (3, 8)
num_filters = 10
dropout_prob = (0.5, 0.8)
hidden_dims = 50
batch_size = 64
num_epochs = 10
sequence_length = 400
max_words = 5000
min_word_count = 1
context = 10

x, y, vocabulary, vocabulary_inv_list = x, y, vocabulary, vocabulary_inv
vocabulary_inv = {key: value for key, value in enumerate(vocabulary_inv_list)}

train_len = int(len(x) * 0.9)
x_train = x[:train_len]
y_train = y[:train_len]
x_test = x[train_len:]
y_test = y[train_len:]

print("Load data...")
if sequence_length != x_test.shape[0]:
    print("Adjusting sequence length for actual size")
    sequence_length = x_test.shape[0]

print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)
print("Vocabulary Size: {:d}".format(len(vocabulary_inv)))
print("y_train", len(y_train))
print("y_test", len(y_test))
print("x_train static shape:", x_train.shape)
print("x_test static shape:", x_test.shape)

if model_type == "CNN-static":
    input_shape = (sequence_length, embedding_dim)
else:
    input_shape = (sequence_length,)

model_input = Input(shape=input_shape)
if model_type == "CNN-static":
    z = model_input
else:
    z = Embedding(len(vocabulary_inv), embedding_dim, input_length=sequence_length, name="embedding")(model_input)
z = Dropout(dropout_prob[0])(z)

conv_blocks = []
for sz in filter_sizes:
    conv = Convolution1D(filters=num_filters,
                         kernel_size=sz,
                         padding="valid",
                         activation="relu",
                         strides=1)(z)
    conv = MaxPooling1D(pool_size=2)(conv)
    conv = Flatten()(conv)
    conv_blocks.append(conv)
z = Concatenate()(conv_blocks) if len(conv_blocks) > 1 else conv_blocks[0]
z = Dropout(dropout_prob[1])(z)
z = Dense(hidden_dims, activation="relu")(z)
model_output = Dense(1, activation="sigmoid")(z)

model = Model(model_input, model_output)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

if model_type == "CNN-non-static":
    weights = np.array([v for v in embedding_weights.values()])
    print("Initializing embedding layer with word2vec weights, shape", weights.shape)
    embedding_layer = model.get_layer("embedding")
    embedding_layer.set_weights([weights])

model.fit(x_train, y_train, batch_size=batch_size, validation_data=(x_test, y_test), epochs=num_epochs, verbose=2)
```
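Judging from the shapes in the error (`(None, 185)` expected versus `(1665, 35)` given), the likely culprit is the sequence-length adjustment above: `x_test.shape[0]` is the number of test samples (185), not the token count per example (`x_test.shape[1] == 35`), so the `Input` layer is built for 185-token sequences. A sketch of the fix:

```python
# Compare against the per-example sequence length, not the number of rows.
if sequence_length != x_test.shape[1]:
    print("Adjusting sequence length for actual size")
    sequence_length = x_test.shape[1]
```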
Keras 1.2.0, Python 2.7.
```python
from keras import backend as K
from keras.engine.topology import Layer
from keras.models import Sequential, Model
from keras.layers.core import Activation, Flatten
from keras.layers import convolutional


class Bias(Layer):
    """Custom keras layer that simply adds a scalar bias to each location in the input

    Largely copied from the keras docs:
    http://keras.io/layers/writing-your-own-keras-layers/#writing-your-own-keras-layers
    """
    def __init__(self, **kwargs):
        super(Bias, self).__init__(**kwargs)

    def build(self, input_shape):
        self.W = K.zeros(input_shape[1:])
        self.trainable_weights = [self.W]

    def call(self, x, mask=None):
        return x + self.W


defaults = {
    "board": 10,
    "filters_per_layer": 128,
    "layers": 12,
    "filter_width_1": 5
}
# copy defaults, but override with anything in kwargs
params = defaults

network = Sequential()
# create first layer
network.add(convolutional.Convolution2D(
    input_shape=(6, 10, 10),
    nb_filter=128,
    nb_row=5,
    nb_col=5,
    init='uniform',
    activation='relu',
    border_mode='same'))
# create all other layers
for i in range(2, 13):
    # use filter_width_K if it is there, otherwise use 3
    filter_key = "filter_width_%d" % i
    filter_width = params.get(filter_key, 3)
    # use filters_per_layer_K if it is there, otherwise use default value
    filter_count_key = "filters_per_layer_%d" % i
    filter_nb = params.get(filter_count_key, 128)
    network.add(convolutional.Convolution2D(
        nb_filter=filter_nb,
        nb_row=filter_width,
        nb_col=filter_width,
        init='uniform',
        activation='relu',
        border_mode='same'))
# the last layer maps each <filters_per_layer> feature to a number
network.add(convolutional.Convolution2D(
    nb_filter=1,
    nb_row=1,
    nb_col=1,
    init='uniform',
    border_mode='same'))
# reshape output to be board x board
network.add(Flatten())
# add a bias to each board location
network.add(Bias())
# softmax makes it into a probability distribution
network.add(Activation('softmax'))
```
gives the following exception:

ValueError: Error when checking model target: expected activation_1 to have shape (None, 60) but got array with shape (10, 100)

The training data is a (10, 6, 10, 10) array and the targets have shape (10, 100); why does the model need (None, 60)? If I change `input_shape=(6, 10, 10)` to `input_shape=(10, 10, 10)`, I get:

ValueError: Error when checking model input: expected convolution2d_input_1 to have shape (None, 10, 10, 10) but got array with shape (10, 6, 10, 10)
@EmilienDupont Can you help me? Thank you very much.
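One likely explanation: with the TensorFlow backend's default channels-last ordering, `input_shape=(6, 10, 10)` is read as a 6×10 image with 10 channels, so the final single-filter convolution outputs shape (6, 10, 1) and `Flatten` yields 6 × 10 = 60 units, hence `(None, 60)`. Declaring the 6 feature planes as channels instead should leave the board 10×10 and make `Flatten` yield 100 units, matching the (10, 100) targets. A sketch for Keras 1.2 (`dim_ordering` would need to be set on every convolution, or globally in `~/.keras/keras.json`):

```python
# Treat the 6 feature planes as channels (Theano-style ordering), so the
# 10x10 board is preserved and Flatten produces 10 * 10 = 100 units.
network.add(convolutional.Convolution2D(
    input_shape=(6, 10, 10),
    nb_filter=128,
    nb_row=5,
    nb_col=5,
    init='uniform',
    activation='relu',
    border_mode='same',
    dim_ordering='th'))
```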
Hi guys, please help. This code from my script:
```python
from numpy import array
from keras.layers import Input, LSTM, Dense
from keras.models import Model

encoder_inputs = Input(shape=(None,))
x = data_object.embeddingLayer(encoder_inputs)
x, state_h, state_c = LSTM(embeddingDim, return_state=True)(x)
encoder_states = [state_h, state_c]

decoder_inputs = Input(shape=(None,))
x = data_object.embeddingLayer(decoder_inputs)
x = LSTM(embeddingDim, return_sequences=True)(x, initial_state=encoder_states)
decoder_outputs = Dense(maxNumWords, activation='softmax')(x)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss=lossFunction)
model.fit([array(inp_x), array(inp_y)], array(out_t), epochs=epochs)
```
returns the error:

ValueError: Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (99, 20, 1001)

The shapes of inp_x, inp_y and out_t are all (99, 20, 1001); I am not sure how to solve the error.
@Phetsa
Have you been able to solve your issue? Seems I am having the same problem here. If you have, may I know how you went about it?
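One likely cause, reading the shapes: `Input(shape=(None,))` feeding an `Embedding` layer expects integer token indices of shape `(num_samples, timesteps)`, but here the inputs are already one-hot encoded over a 1001-word vocabulary. A sketch of one way to adapt (assuming the last axis is the one-hot vocabulary dimension):

```python
import numpy as np

# Collapse one-hot vectors back to integer token ids: (99, 20, 1001) -> (99, 20)
inp_x_ids = np.argmax(inp_x, axis=-1)
inp_y_ids = np.argmax(inp_y, axis=-1)

# The target out_t can stay one-hot, (99, 20, 1001), matching the softmax
# Dense output, assuming lossFunction is categorical_crossentropy.
model.fit([inp_x_ids, inp_y_ids], out_t, epochs=epochs)
```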
I was able to take care of this error by adding a reshape layer at the end. Example:

```python
from keras.layers import Dense, Activation, Reshape
from keras.models import Sequential

input_dim = 2
output_dim = 10

model = Sequential([
    Dense(32, input_dim=input_dim),
    Activation('relu'),
    Dense(output_dim),
    Activation('softmax'),
    Reshape((output_dim,)),
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd')
model.fit([[0,1], [1,1], [1,0]], [1,2,3])
```