Pymc3: Sample Posterior Predictive Behavior

Created on 23 Sep 2018 · 7Comments · Source: pymc-devs/pymc3

This is a proposal to update sample_posterior_predictive behavior.

Consider:

with pm.Model():
    x = pm.Normal('x')
    y = pm.Normal('y', mu=x, observed=1)
    posterior = pm.sample(draws=500, chains=4)
    posterior_predictive = pm.sample_posterior_predictive(posterior)

print(posterior_predictive['y'].shape)

I propose this should print (2000,).

Current behavior is (500,), and the implementation is roughly:

indices = np.random.randint(0, len(trace), samples)
for idx in indices:
    point = trace[idx]
    ppc_samples = draw_values([var], point=point)
return np.array(ppc_samples)

Good things:

Supports any number of samples
They are asymptotically coming from the posterior predictive distribution
Samples are perhaps less correlated

Bad things:

~Ignores all but the first chain~
Ignores some of the ~first chain's~ posterior samples ~too~
Can not match posterior samples with ppc draws for diagnostics (this came up in arviz)

This would be a reasonably large breaking change! In a perfect world, I would want the return value to always be (4, 500), but if we make it so .reshape(4, 500) matches up with the shape of the full trace. My suggestion is:

Change the default to what I have described,
Add a shuffle argument
If samples is bigger than 2000 and shuffle is False, cycle back over the posterior (in order).

Source

ColCarroll

Most helpful comment

junpenglao - Yes, the impetus for this is in ArviZ, the dimensions allow you to select by chain and draw, so having that consistent would be nice. I could imagine ppc joint scatter plots being interesting (or maybe not interesting at all?) The scatter plots would require that these dimensions be aligned.

twiecki - This would also stop shuffling the samples (which seems fine to me)

This seems like a positive enough response that I'll make a PR!

ColCarroll on 24 Sep 2018

👍2

All 7 comments

Ignores all but the first chain

That's not quite the case - it randomly samples from all 2000 posterior samples and generates 500 posterior predictive samples.

https://github.com/pymc-devs/pymc3/blob/14b3fd01340c269dcc086e5b12e1bc503502c16e/pymc3/sampling.py#L1126-L1131

junpenglao on 23 Sep 2018

You're right, thanks - I read line 1131 from there, instead of 1129. I'll update the issue.

ColCarroll on 23 Sep 2018

Cool - also

Add a shuffle argument

is probably not needed.

I am curious about the context of the purpose change: is it for it to easier to work with using arviz? i guess Stan return the posterior predictive samples the same (chains, draws, ...) shape as the MCMC samples.

junpenglao on 23 Sep 2018

So really the proposal is to sample the number of total samples, which makes sense to me.

twiecki on 23 Sep 2018

👍1

twiecki - This would also stop shuffling the samples (which seems fine to me)

This seems like a positive enough response that I'll make a PR!

ColCarroll on 24 Sep 2018

👍2

@ColCarroll @junpenglao @twiecki I had some trouble trying to understand how posterior predictive sampling worked until I looked at the source code. If it is ok with you I will submit a PR to expand the docs a little and I was also thinking in logging some warning when samples != nchains*ndraws. Or at least when samples<<nchains*ndraws, as whole chains are not taken into account.

I am not really sure about the use of size parameter (and even though ArviZ has no support for it, there are no issues related yet). Can I discourage its use in the docs?