Pymc3: Vector of i.i.d. Poisson gives incorrect distribution for vector size > 50

Created on 8 Aug 2018 · 3Comments · Source: pymc-devs/pymc3

If you have questions about a specific use case, or you are not sure whether this is a bug or not, please post it to our discourse channel: https://discourse.pymc.io

Description of your problem

I've been trying to recreate the blog post on a principled Bayesian workflow here and ran into trouble on step five of section 3.1 where a Poisson with shape larger than about 70 has a sampled distribution strongly peaked around its mu value, with only that value in the sample where N>=100
Please provide a minimal, self-contained, and reproducible example.

import pymc3 as pm
import matplotlib.pyplot as plt

N = 1000
generative_model = pm.Model()
with generative_model:
    y = pm.Poisson("y",mu=5.0,shape=(N,))
    trace = pm.sample(1000)

plt.hist(trace['y'].flatten());

Please provide any additional information below.
Tested both in Python 3.6 and 2.7. Setting N=1 gives correct distribution. Same is true for switching Poisson to another distribution such as Normal.

Versions and main components

PyMC3 Version: 3.5
Theano Version: 1.0.2
Python Version: 3.6
Operating system: 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 13:19:00)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
How did you install PyMC3: conda

Source

sempwn

Most helpful comment

discrete sampling is unreliable using metropolis with low number of samples (ie, 1000 in this case). For sampling from prior you should use pm.sample_prior

junpenglao on 8 Aug 2018

👍2

All 3 comments

@sempwn thanks for taking the time to report this issue. This looks like a limitation of the metropolis sampler as this also happens with other discrete distributions.

aloctavodia on 8 Aug 2018

👍2

discrete sampling is unreliable using metropolis with low number of samples (ie, 1000 in this case). For sampling from prior you should use pm.sample_prior

junpenglao on 8 Aug 2018

👍2

@junpenglao Thank you! For posterity here is a minimal working example

model = pm.Model()
N = 1000
R = 500
y_data = np.random.poisson(lam=5.0,size=N)
with model:    
    y = pm.Poisson("y",mu=0.5,shape=(N,),observed=y_data)

    trace = pm.sample_prior_predictive(samples=R)

plt.hist(trace['y'])

sempwn on 8 Aug 2018

Was this page helpful?

0 / 5 - 0 ratings