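The model in the VAE tutorial (http://pyro.ai/examples/vae.html) has a Bernoulli observation model. However, the images from MNIST (after preprocessing) have values in [0, 1], which are not valid observations under this model. I think one has to binarize these images before feeding them to the model/guide, either by sampling the pixels (https://github.com/blei-lab/edward/blob/081ea532a982e6d2c88da25d6e2527f6a66f09ab/examples/vae.py#L38) or with a threshold (https://github.com/altosaar/variational-autoencoder/blob/1944d3a2eca4730339519cae557533f482237be1/vae.py#L169).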
We are aware of this and report it explicitly. We chose this for
convenience, to avoid arbitrary binarizations, following established
methodology that goes back to the RBM literature. The model can still train
on MNIST and basically works, but the likelihoods in the non-binarized case
are of course slightly inflated.
For a paper you would download a binarized version and report the resulting
slightly worse likelihoods, but the model would behave the same as here.
If you want to be precise, you could also fix this by sampling from
dist.bernoulli with the loaded images as the parameters of the distribution,
and using the samples as data with the given model.
Just be careful to do this only once: resampling the data repeatedly
provides unfair regularization to the model and also inflates likelihood
scores.
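For reference, here is a minimal sketch of the two binarization options mentioned in this thread (one-time sampling and thresholding). It uses NumPy rather than Pyro's dist.bernoulli so that it runs on its own; the helper names binarize_once and binarize_threshold, the fixed seed, and the random stand-in for the preprocessed MNIST array are all assumptions made for illustration, not part of the tutorial.

```python
import numpy as np


def binarize_once(images, seed=0):
    """Draw one Bernoulli sample per pixel, using the intensity as the probability.

    A fixed seed makes the result reproducible, so the same binarized
    dataset is reused instead of being resampled on every pass.
    """
    rng = np.random.default_rng(seed)
    # A uniform draw u satisfies P(u < x) = x, so each pixel becomes 1 with probability x.
    return (rng.random(images.shape) < images).astype(np.float32)


def binarize_threshold(images, threshold=0.5):
    """Deterministic alternative: pixels above the threshold become 1."""
    return (images > threshold).astype(np.float32)


# Stand-in for preprocessed MNIST: 64 flattened 28x28 images with values in [0, 1].
images = np.random.default_rng(123).random((64, 784)).astype(np.float32)

binary_images = binarize_once(images)     # compute once, then train on this array
thresholded = binarize_threshold(images)
```

The important design point is that binarize_once is called a single time before training, not inside the data loader or the training loop.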
But why do we do this slightly hacky thing? E.g., why not just use a continuous observation model?
In my experience, the best way to get a VAE working on MNIST is to use the expected value of the decoder distribution as the sample. With a Bernoulli decoder this makes sense, because the binary cross entropy you minimize is the negative of the log-likelihood
x ln p + (1 - x) ln(1 - p)
where x is the pixel value and p is the predicted mean of the Bernoulli distribution, which you are trying to match to the observed pixel. In this case the distribution takes the form
p(x|z) = p^x * (1 - p)^(1 - x)
and in my opinion that is why it seems reasonable to use the expected value: with x in [0, 1] rather than {0, 1}, this is not a properly defined Bernoulli distribution.
See Bishop's neural networks book, section 6.7.
Hope it helps!
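To make the two claims above concrete, here is a small sketch using only the Python standard library. It checks numerically that, for a gray pixel value x, the log-likelihood x ln p + (1 - x) ln(1 - p) is maximized at p = x, and that p^x * (1 - p)^(1 - x) does not integrate to 1 over x in [0, 1]. The function names, the grid search, and the midpoint-rule integration are illustrative choices, not anything taken from the tutorial.

```python
import math


def bernoulli_log_lik(x, p):
    """x ln p + (1 - x) ln(1 - p), evaluated for a possibly non-binary x in [0, 1]."""
    return x * math.log(p) + (1.0 - x) * math.log(1.0 - p)


def mass_over_unit_interval(p, steps=10_000):
    """Midpoint-rule estimate of the integral of p^x (1 - p)^(1 - x) for x in [0, 1]."""
    h = 1.0 / steps
    return h * sum(math.exp(bernoulli_log_lik((i + 0.5) * h, p)) for i in range(steps))


# For a gray pixel, search a grid of candidate means for the best fit.
x = 0.3
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: bernoulli_log_lik(x, p))

# Total mass that the "density" assigns to [0, 1] for a given p.
mass = mass_over_unit_interval(0.9)

print(best_p, mass)
```

The integral has the closed form (2p - 1) / ln(p / (1 - p)) for p != 0.5, which never exceeds 1/2, so the expression is an unnormalized density when x is continuous. The optimum at p = x is what justifies using the decoder mean as the reconstruction.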