Howdy,
I have a dataset where the output is one of 5k categories. I also have millions of samples. The naive representation of y_indices_naive (the outputs) is:
[1,5,4300,...]
But it seems that Keras/Theano require one-hot encodings of the output.
Problem is, np_utils.to_categorical(y_indices_naive) causes an out-of-memory error because then I need a 3mil x 5k matrix.
Is there any way to get Keras to accept y_indices_naive without converting it to one-hot? I would be happy to add some code if someone would point out how to best do it.
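For a sense of scale, here is a rough back-of-the-envelope sketch of the memory involved. The sample and class counts are the approximate figures from the question; the element sizes (float32, float64, int32) are assumptions, since the actual dtype depends on the Keras version and how the labels are stored.

```python
def dense_one_hot_bytes(n_samples, n_classes, bytes_per_value):
    """Size in bytes of a dense (n_samples, n_classes) one-hot matrix."""
    return n_samples * n_classes * bytes_per_value

n_samples = 3_000_000   # "3mil" samples from the question
n_classes = 5_000       # "5k" categories from the question

one_hot_f32 = dense_one_hot_bytes(n_samples, n_classes, 4)  # assumed float32
one_hot_f64 = dense_one_hot_bytes(n_samples, n_classes, 8)  # assumed float64
indices_i32 = n_samples * 4                                 # one int32 index per sample

print(f"one-hot float32: {one_hot_f32 / 1e9:.0f} GB")   # 60 GB
print(f"one-hot float64: {one_hot_f64 / 1e9:.0f} GB")   # 120 GB
print(f"int32 indices:   {indices_i32 / 1e6:.0f} MB")   # 12 MB
```

The integer labels themselves are tiny; it is only the dense one-hot expansion of the whole dataset that does not fit.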
Theano has no support for sparse operations as far as I know (and Keras certainly doesn't either). So all data will have to be converted to dense arrays at some point.
However a 5k-dimensional output space doesn't seem very large to me.
You can solve your OOM error by one-hot encoding and training batch-by-batch instead of 3M samples at once. Break down your dataset into small batches, and for each batch:
y_batch = np_utils.to_categorical(y_indices_batch, nb_classes=5000)
model.train_on_batch(X_batch, y_batch)
As long as 1) your model fits in memory and 2) your batches are small enough, this will not cause any memory issues.
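A minimal sketch of that batch-by-batch idea, written with plain NumPy so it runs without Keras. The helper names `one_hot` and `iter_batches`, the batch size, and the toy data are all hypothetical; in the real setup the loop body would call model.train_on_batch as shown above.

```python
import numpy as np

def one_hot(indices, n_classes):
    """Dense one-hot rows for a 1-D array of integer class indices."""
    indices = np.asarray(indices, dtype=np.int64)
    out = np.zeros((indices.shape[0], n_classes), dtype=np.float32)
    out[np.arange(indices.shape[0]), indices] = 1.0
    return out

def iter_batches(X, y_indices, n_classes, batch_size=128):
    """Yield (X_batch, y_batch) pairs, one-hot encoding only the current batch."""
    for start in range(0, len(y_indices), batch_size):
        stop = start + batch_size
        yield X[start:stop], one_hot(y_indices[start:stop], n_classes)

# Toy data standing in for the real dataset (hypothetical shapes).
X = np.random.rand(10, 3)
y_indices_naive = np.array([1, 5, 4300, 0, 2, 4999, 7, 7, 42, 3])

for X_batch, y_batch in iter_batches(X, y_indices_naive, n_classes=5000, batch_size=4):
    # In the thread's setup: model.train_on_batch(X_batch, y_batch)
    print(X_batch.shape, y_batch.shape)
```

Only one batch of one-hot rows exists at a time, so peak memory for the targets is batch_size x n_classes values rather than n_samples x n_classes.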
Theano's tensor.nnet.categorical_crossentropy can accept a vector of integers as the true distribution.
ps thanks for your great library btw!
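To illustrate why integer targets are enough for this loss, here is a small NumPy sketch (not Theano's or Keras's actual implementation) showing that cross-entropy against a one-hot row reduces to the negative log of the probability at the target index. The function names and the example probabilities are made up for illustration.

```python
import numpy as np

def categorical_crossentropy_one_hot(probs, one_hot_targets):
    """Per-sample cross-entropy against full one-hot target rows."""
    return -np.sum(one_hot_targets * np.log(probs), axis=1)

def categorical_crossentropy_indices(probs, target_indices):
    """Per-sample cross-entropy against integer class indices."""
    rows = np.arange(probs.shape[0])
    return -np.log(probs[rows, target_indices])

# Hypothetical predictions for 2 samples over 3 classes (rows sum to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
target_indices = np.array([0, 2])
one_hot_targets = np.eye(3)[target_indices]

dense_loss = categorical_crossentropy_one_hot(probs, one_hot_targets)
sparse_loss = categorical_crossentropy_indices(probs, target_indices)
print(np.allclose(dense_loss, sparse_loss))  # True
```

The index-based version never materializes the one-hot matrix, which is the whole point when there are thousands of classes.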
@lightcaster Thanks for the response, and sorry for the long delay.
It does indeed look like tensor.nnet.categorical_crossentropy allows the output to be a vector of integers, but I am not sure how to get Keras and Theano to play nice here.
Here is my model:
rnn_dim = 512
dense_dim = 512
model = Sequential()
model.add(Embedding(n_symbols + 1, rnn_dim, mask_zero=True))
model.add(GRU(rnn_dim, dense_dim, return_sequences=False))
model.add(Dropout(0.5))
model.add(Dense(dense_dim, n_symbols, activation='sigmoid'))
Note that the output is n_symbols in dimension, so if I try to do model.fit with just a vector of integers it throws me this error:
ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 1, but the output's size on that axis is 4938
Any ideas on what to do?
The target is going to be normalized and reshaped by model.fit. You can avoid this by implementing your own version of model.train_on_batch, etc., that simply calls the Theano functions model._train, model._test, and model._predict directly.
We'll look into more direct support.
Ah great, thanks. I'll see about calling the Theano functions directly.
I am facing a similar problem: the number of output classes in my case is 50000 and the loss is 'categorical_crossentropy'. If I pass the index of the 1 in the one-hot encoding instead of the full vector, Keras complains that the shape should be 3-dimensional. I checked Theano's T.nnet.categorical_crossentropy, and it accepts the class index rather than the full one-hot vector. Can't Keras also support this functionality?
You should be using sparse_categorical_crossentropy instead, which accepts label indices rather than one-hot encoded labels.
@fchollet I tried it, but it gave the same error (expecting 3D input but got 2D instead), so I switched to batch training with the output labels one-hot encoded.
sparse_categorical_crossentropy works fine (it's unit-tested, and I use it regularly), so your problem lies elsewhere entirely.
@fchollet Thanks for your reply, I'll have a look at my code and check where the problem lies
@shashankg7 Did you solve the "expecting 3D input but got 2D instead" problem? I am also facing this. @fchollet
Can you guys please help!! #5662
The trick to fix the "expecting a 3D input" error when using sparse_categorical_crossentropy is to give the labels an extra trailing dimension of size 1, so that each integer label sits in its own single-element list. So instead of formatting the output like this:
y_indices_naive = [1,5,4300,...]
it should be formatted this way:
y_indices_naive = [[1,], [5,], [4300,],...]
That will make Keras happy, and it will train the model as expected.
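If the labels already live in a NumPy array, the same reformatting can be done without building nested lists by hand. This is a small sketch; the sequence-shaped example at the end is an assumption about the case where the model emits one prediction per timestep, which is where a 3D target would be expected.

```python
import numpy as np

# Flat integer labels, as in the original question.
y_indices_naive = np.array([1, 5, 4300])            # shape (3,)

# Add a trailing axis of size 1: one single-element row per sample.
y_indices_column = y_indices_naive.reshape(-1, 1)   # shape (3, 1)

print(y_indices_column.shape)     # (3, 1)
print(y_indices_column.tolist())  # [[1], [5], [4300]]

# Assumed sequence case: labels of shape (batch, timesteps)
# become (batch, timesteps, 1) with the same trailing axis.
y_sequence = np.zeros((2, 4), dtype=np.int32)
y_sequence_3d = np.expand_dims(y_sequence, -1)
print(y_sequence_3d.shape)        # (2, 4, 1)
```

Either way the array still holds one small integer per label, so memory use stays far below that of a one-hot matrix.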