Pymc3: Vector of i.i.d. Poisson gives incorrect distribution for vector size > 50

Created on 8 Aug 2018  路  3Comments  路  Source: pymc-devs/pymc3

If you have questions about a specific use case, or you are not sure whether this is a bug or not, please post it to our discourse channel: https://discourse.pymc.io

Description of your problem

I've been trying to recreate the blog post on a principled Bayesian workflow here and ran into trouble on step five of section 3.1 where a Poisson with shape larger than about 70 has a sampled distribution strongly peaked around its mu value, with only that value in the sample where N>=100
Please provide a minimal, self-contained, and reproducible example.

import pymc3 as pm
import matplotlib.pyplot as plt

N = 1000
generative_model = pm.Model()
with generative_model:
    y = pm.Poisson("y",mu=5.0,shape=(N,))
    trace = pm.sample(1000)

plt.hist(trace['y'].flatten());

Please provide any additional information below.
Tested both in Python 3.6 and 2.7. Setting N=1 gives correct distribution. Same is true for switching Poisson to another distribution such as Normal.

Versions and main components

  • PyMC3 Version: 3.5
  • Theano Version: 1.0.2
  • Python Version: 3.6
  • Operating system: 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 13:19:00)
    [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
  • How did you install PyMC3: conda

Most helpful comment

discrete sampling is unreliable using metropolis with low number of samples (ie, 1000 in this case). For sampling from prior you should use pm.sample_prior

All 3 comments

@sempwn thanks for taking the time to report this issue. This looks like a limitation of the metropolis sampler as this also happens with other discrete distributions.

discrete sampling is unreliable using metropolis with low number of samples (ie, 1000 in this case). For sampling from prior you should use pm.sample_prior

@junpenglao Thank you! For posterity here is a minimal working example

model = pm.Model()
N = 1000
R = 500
y_data = np.random.poisson(lam=5.0,size=N)
with model:    
    y = pm.Poisson("y",mu=0.5,shape=(N,),observed=y_data)

    trace = pm.sample_prior_predictive(samples=R)

plt.hist(trace['y'])
Was this page helpful?
0 / 5 - 0 ratings

Related issues

fonnesbeck picture fonnesbeck  路  3Comments

Abraxas2071 picture Abraxas2071  路  4Comments

aakhmetz picture aakhmetz  路  4Comments

junpenglao picture junpenglao  路  5Comments

jonathanhfriedman picture jonathanhfriedman  路  6Comments