I am having issues trying to get the sample variational autoencoder working. I'm running the script verbatim from the site, but I keep getting results where all locations in the latent space produce the same output, and the distribution of test images over the latent space is nowhere near as spread out as shown. Running on the latest versions of Theano and Keras.


This behaviour also struck me for a while. Since I found that the standard autoencoder with a linear encoder works (meaning the 'recognition model' in the VAE becomes a Gaussian with zero std), my suspicion fell on the Gaussian sampling strategy.
So I changed the line generating epsilon to use a smaller standard deviation, as follows:
epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., std=0.1)
and it makes the VAE behave correctly in this particular case -- it might need a different value for other network parameters, learning settings, and/or datasets.
Oh, you would also need to rescale the z_sample when generating the digit samples:
z_sample = np.array([[xi, yi]]) * 0.1
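For context, here's a minimal sketch of where this change lands in the example's sampling layer -- batch_size, latent_dim, z_mean, and z_log_var are assumed to be defined as in the Keras VAE example, and the std keyword follows the Theano-era Keras backend API:

from keras import backend as K
from keras.layers import Lambda

def sampling(args):
    z_mean, z_log_var = args
    # Reduced-std noise: 0.1 instead of the default 1.0
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., std=0.1)
    return z_mean + K.exp(z_log_var / 2) * epsilon

# z_mean and z_log_var are the encoder outputs from the example
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])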
I believe this broke when commit f6bcaffe was brought in, which makes the reconstruction loss excessively small by removing the multiplication by original_dim on Line 42. The problem should be fixed by bringing the multiplication back as
xent_loss = original_dim * objectives.binary_crossentropy(x, x_decoded_mean)
or updating the whole calculation as
xent_loss = K.sum(K.binary_crossentropy(x_decoded_mean, x), axis=-1)
I feel like the current implementation is redundant: it takes the mean of the loss and then multiplies by the number of dimensions again. Why not just take the sum of the whole loss?
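For reference, a sketch of the full loss with the scaling restored -- original_dim, z_mean, and z_log_var are assumed to be defined as in the example, and objectives.binary_crossentropy follows the Theano-era Keras API:

from keras import objectives
from keras import backend as K

def vae_loss(x, x_decoded_mean):
    # Scale the per-pixel mean cross-entropy back up by the input dimension
    xent_loss = original_dim * objectives.binary_crossentropy(x, x_decoded_mean)
    # KL divergence between the approximate posterior and a unit Gaussian
    kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return xent_loss + kl_loss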
Thanks,
Thanks @tushuhei for the better answer. I didn't realise that before. I've reverted to the default std and used your suggested loss function. It works like a charm!
The bug can be closed then.
@stuartlynn @ghif In line 69 of the example, shouldn't the epsilon (noise) be set to zero when plotting the test examples into the latent space? (Or plot them many times to show the spread.)
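In other words, plot each test input at its posterior mean rather than at a noisy sample -- a sketch, assuming encoder is the Model(x, z_mean) built in the example:

import matplotlib.pyplot as plt

# Project test inputs to their posterior means (no sampling noise)
x_test_encoded = encoder.predict(x_test, batch_size=batch_size)
plt.figure(figsize=(6, 6))
plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test)
plt.colorbar()
plt.show()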
Thank you! I ran into the same problem in my own project, and this method really helped!