Error when checking target: expected dense_2 to have shape (20,) but got array with shape (1000,) while running pretrained_word_embeddings.py
https://github.com/keras-team/keras/blob/master/examples/pretrained_word_embeddings.py
As far as I can tell, the error is related to your network.
Can you post your code, or at least the initial part?
I assume the network was not set up as it needs to be: the last layer expects targets of shape 20, but you are passing targets of shape 1000 (not counting the batch dimension).
Please find the code below
`BASE_DIR = 'C:\Users\'
GLOVE_DIR = os.path.join(BASE_DIR, 'glove.6B')
TEXT_DATA_DIR = os.path.join(BASE_DIR, 'news20\20_newsgroup')
MAX_SEQUENCE_LENGTH = 1000
MAX_NUM_WORDS = 20000
EMBEDDING_DIM = 100
VALIDATION_SPLIT = 0.2
EMBEDDING_DIM = 100
embeddings_index = {}
with open(os.path.join(GLOVE_DIR, 'glove.6B.100d.txt'), encoding="utf8") as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:],dtype = 'float32')
        embeddings_index[word] = coefs
texts = [] # list of text samples
labels_index = {} # Dictionary mapping of label to label id
labels = [] # List of label ids
for name in sorted(os.listdir(TEXT_DATA_DIR)):
    path = os.path.join(TEXT_DATA_DIR, name)
    if os.path.isdir(path):
        label_id = len(labels_index)
        labels_index[name] = label_id
        for fname in sorted(os.listdir(path)):
            if fname.isdigit():
                fpath = os.path.join(path, fname)
                args = {} if sys.version_info < (3,) else {'encoding':'latin-1'}
                with open(fpath, **args) as f:
                    t = f.read()
                    i = t.find("\n\n") # Skips header
                    if 0 < i:
                        t = t[i:]
                    texts.append(t)
                labels.append(label_id)
tokenizer = Tokenizer(num_words = MAX_NUM_WORDS)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
data = pad_sequences(sequences, maxlen = MAX_SEQUENCE_LENGTH)
labels = to_categorical(np.asarray(labels))
indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]
num_validation_samples = int(VALIDATION_SPLIT*data.shape[0])
x_train = data[:-num_validation_samples]
y_train = data[:-num_validation_samples]
x_test = data[-num_validation_samples:]
y_test = data[-num_validation_samples:]
num_words = min(MAX_NUM_WORDS, len(word_index)+1)
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
for word, i in word_index.items():
    if i >= MAX_NUM_WORDS:
        continue
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length = MAX_SEQUENCE_LENGTH,
                            trainable = False)
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype ='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation = 'relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation = 'relu')(x)
x = MaxPooling1D(5)(x)
x = Conv1D(128, 5, activation = 'relu')(x)
x = GlobalMaxPooling1D()(x)
print(x.shape)
x = Dense(128, activation='relu')(x)
print(x.shape)
preds = Dense(len(labels_index), activation='softmax')(x)
model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])
model.fit(x_train, y_train, batch_size=128, epochs = 10, validation_data=(x_test, y_test))`
I am not sure what is causing the problem.
My wild guess is that labels_index has 20 entries and that this is where the mismatch comes from. Can you confirm the output of len(labels_index)?
The output is 20, since the model predicts 20 categories.
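One way to narrow this kind of mismatch down before calling `model.fit` is to compare the width of the target array with the number of classes. The following is only a sketch: `check_targets` is a hypothetical helper (it is not part of Keras), and the arrays are zero-filled placeholders rather than the real data.

```python
import numpy as np


def check_targets(y, num_classes):
    """Raise ValueError unless y is a 2-D array with num_classes columns."""
    if y.ndim != 2 or y.shape[1] != num_classes:
        raise ValueError(
            "expected targets with shape (n, %d) but got %s" % (num_classes, y.shape)
        )


num_classes = 20                    # value reported for len(labels_index)
good = np.zeros((8, num_classes))   # placeholder targets with one column per class
bad = np.zeros((8, 1000))           # placeholder targets with the wrong width

check_targets(good, num_classes)    # passes silently

try:
    check_targets(bad, num_classes)
except ValueError as err:
    message = str(err)
    print(message)
```

Running a check like this on `y_train` and `y_test` would show whether the targets, rather than the network, have the unexpected shape.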
GitHub is for issues in Keras itself, while this is just an implementation error. In the future, please ask on Stack Overflow instead of posting on GitHub. Your targets are sliced from data instead of labels, so each target has shape (1000,) rather than (20,):
x_train = data[:-num_validation_samples]
y_train = data[:-num_validation_samples]  # supposed to be labels
x_test = data[-num_validation_samples:]
y_test = data[-num_validation_samples:]  # supposed to be labels
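A minimal sketch of the corrected split, using small random arrays in place of the real padded sequences and one-hot labels. The sample count, the seed, and the `label_ids` and `NUM_CLASSES` names are illustrative assumptions, and `np.eye(...)[...]` stands in for `to_categorical`. The only point is that `y_train` and `y_test` are sliced from `labels`, so their second dimension matches the 20 units of the final `Dense` layer.

```python
import numpy as np

# Illustrative stand-ins for the arrays built in the script (assumed sizes).
MAX_SEQUENCE_LENGTH = 1000
NUM_CLASSES = 20
VALIDATION_SPLIT = 0.2
num_samples = 50

rng = np.random.default_rng(0)
data = rng.integers(0, 20000, size=(num_samples, MAX_SEQUENCE_LENGTH))
label_ids = rng.integers(0, NUM_CLASSES, size=num_samples)
labels = np.eye(NUM_CLASSES, dtype="float32")[label_ids]  # one-hot rows

# Shuffle data and labels with the same permutation so rows stay aligned.
indices = np.arange(data.shape[0])
rng.shuffle(indices)
data = data[indices]
labels = labels[indices]

num_validation_samples = int(VALIDATION_SPLIT * data.shape[0])

x_train = data[:-num_validation_samples]
y_train = labels[:-num_validation_samples]  # targets come from labels, not data
x_test = data[-num_validation_samples:]
y_test = labels[-num_validation_samples:]   # targets come from labels, not data

print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)
```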
Thank you so much for your help
I have a similar error message: "ValueError: Error when checking target: expected dense_2 to have shape (10,) but got array with shape (1,)". It seems the cause is the loss parameter of the model: with loss = 'sparse_categorical_crossentropy' I get no error, but with loss = 'categorical_crossentropy' I do.
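This behaviour depends on how the targets are encoded: `sparse_categorical_crossentropy` takes integer class ids, while `categorical_crossentropy` takes one-hot rows with one column per class. A rough sketch using NumPy only, where the label values and `num_classes` are made up for illustration and `np.eye(...)[...]` stands in for Keras's `to_categorical`:

```python
import numpy as np

num_classes = 10                      # assumed: matches the (10,) in the error message
int_labels = np.array([3, 0, 9, 3])   # hypothetical integer class ids

# Shape (4,): one integer per sample, the format used with
# loss='sparse_categorical_crossentropy'.
sparse_targets = int_labels

# Shape (4, 10): one-hot rows, the format used with
# loss='categorical_crossentropy'.
one_hot_targets = np.eye(num_classes, dtype="float32")[int_labels]

print(sparse_targets.shape)
print(one_hot_targets.shape)
```

So with integer labels, either keep the sparse loss or convert the labels to one-hot before using `categorical_crossentropy`.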
But even after I changed that, I still got the error, only with a smaller shape in the message.