Pymc3: Forward sampling functions should warn that they ignore Potential in a model

Created on 31 Mar 2020 · 13Comments · Source: pymc-devs/pymc3

Description of your problem

As discussed here with @junpenglao, it seems like sample_posterior_predictive cannot account for a Potential in a model with Geometric likelihood:

with pm.Model() as m_bike:
    a = pm.Normal("a", 0.0, 0.5)
    bT = pm.Normal("bT", 0.0, 0.2)

    p = pm.math.invlogit(a + bT * bike_data["temp_std"])

    bike_count = pm.Geometric('bike_count', p, observed=bike_data["count"])
    pm.Potential('constraint', tt.switch(bike_count > 1100, -np.inf, 0.))

    trace_bike_poisson = pm.sample(1000, tune=2000, random_seed=RANDOM_SEED)
    post_samples = pm.sample_posterior_predictive(
        trace_bike_poisson, random_seed=RANDOM_SEED
    )

idata = az.from_pymc3(trace=trace_bike_poisson, posterior_predictive=post_samples)
az.plot_ppc(idata);

This samples perfectly, but the constraint was not applied to PP samples:
Capture d’écran 2020-03-10 à 21 27 48
The data come from Kaggle’s bike-sharing demand contest.

Versions and main components

PyMC3 Version: Master
Theano Version: 1.0.4
Python Version: 3.7.6
Operating system:OS X
How did you install PyMC3: pip

Source

AlexAndorra

All 13 comments

By design, potential's only affect the model's logp. This means that they only have an impact in pm.sample. Forward sampling (prior and posterior predictive sampling) completely ignore potentials. As @junpenglao said over at discourse this issue should actually be: "forward sampling should warn that potential's are completely ignored".

lucianopaz on 31 Mar 2020

👍1

Similarly, transforms are (mostly) ignored. Except when they aren't (e.g., TruncatedNormal).

rpgoldman on 31 Mar 2020

As we've discussed elsewhere, what we call a distributions transform in pymc3 is actually just an operation that is used in the logp to get the distribution values into an unconstrained space. Said operation is never used in forward sampling. For example, the TruncatedNormal, the Uniform or the HalfNormal distributions all get automatically transformed to help pm.sample in the unconstrained space. Their random methods produce samples in the correct constrained space by construction. The misleading part comes when a user specifies a distributions transform argument. That custom operation is never used in forward sampling, but the problem I'm that case comes from the fact that the transform argument should have never been called transform in the first place.

lucianopaz on 1 Apr 2020

👍1

Well, arguably if you specify a transform, and we can't support generating transformed samples, then IMO forward sampling should just raise an error on transformed RVs and give up.

Unfortunately, the docs here are quite opaque. Here's the definition of the Transform class, which is almost completely unhelpful:

Transform instances are the entities that should be used in the transform parameter to a random variable constructor.

... and the transform option for variable creation is nowhere documented. Part of this problem is that the notion of "random variable" is nowhere documented -- or at least not in the API section. So I guess I'm not sure what the transform option is intended to do (or why it's exposed to the programmer).

rpgoldman on 1 Apr 2020

Can we have a check of the logp when forward sampling that rejects the sample if the constraint is not satisfied?

fonnesbeck on 25 Apr 2020

👍1

I thought it was just me, but then we got a bug report where someone passed None to the transform initarg and it broke sampling!

rpgoldman on 26 Apr 2020

@lucianopaz @fonnesbeck -- could we address this issue by having the forward sampling functions use importance sampling? I.e., could we evaluate the potential logp on each sample and use that as the weight for the sample? (and I'm kind of hazy on the math here -- we might have to take the antilog of the potential value, right?)

I'm not sure how possible it would be to implement this, since all the code having to do with traces have deeply baked in the assumption that all the trace elements have the same weight.

I suppose one could instead sub-sample from the original trace using a weighting derived from the potential value at each sample to develop a new, weighted trace.

I'm not sure how else one could incorporate a potential in forward sampling, but this is just the project of a few minutes musing while I was on a walk.

rpgoldman on 29 Apr 2020

Yeah, sampling-importance-resampling would be more general than just rejecting constraints since it would deal with the case where the Potential is not just returning -inf values as the constraint.

fonnesbeck on 5 May 2020

could we address this issue by having the forward sampling functions use importance sampling? I.e., could we evaluate the potential logp on each sample and use that as the weight for the sample?

Yes, this could be done. However, I'm not in favor of doning it by default.

Drawing forward samples from a model with potentials could be done with particle filters like importance sampling, SIR or SMC, but it could also be done using mcmc methods like NUTS or Gibbs. The difference between pm.sample and our forward sampling counterparts will depend on whether the user specified observations, or if it clamped the values of the RVs to values stored in some trace.

I think that we should simply warn the user that forward sampling ignores potentials and direct it to use some other Montecarlo method if it interested in always satisfying the potentials.

lucianopaz on 7 May 2020

@lucianopaz So it seems that what is needed is:

A patch to the forward sampling functions that will warn in case there is a potential in the model.
A notebook that shows how to use sample for predictive sampling (presumably by stripping out the observations from a model and then sampling?)

rpgoldman on 16 May 2020

Working on 1. is on my to-do list -- hopefully I'll work on it by the end of May 😊
I'm not sure I understand what 2. means, but this could make sense to do it in the same PR 🤷‍♂️

AlexAndorra on 16 May 2020

Working on 1. is on my to-do list -- hopefully I'll work on it by the end of May blush
I'm not sure I understand what 2. means, but this could make sense to do it in the same PR man_shrugging

I think step 2, was just about having a notebook where you use normal pm.sample() in a model with Potentials, but no observed variables, to obtain the same prior_predictive_sampling that one would get if the prior_predictive_sampling was able to work with models with Potentials.

What one would desire:

with pm.Model():
    a = pm.Normal('a', 0, 1)
    p = pm.Potential('constraint', a**2)

    like = pm.Normal('like', a, 1, observed=[0, 0, 1, 2])
    prior_ppc = pm.sample_prior_predictive()

How it can be obtained:

with pm.Model():
    a = pm.Normal('a', 0, 1)
    p = pm.Potential('constraint', a**2)

    like = pm.Normal('like', a, 1)
    prior_ppc = pm.sample()

It does not solve the problem of posterior predictive samples though...

ricardoV94 on 31 Jul 2020

Thanks @ricardoV94! Having thought a bit more about this now, I think that's exactly that 👌
As you say, this doesn't solve post pred sampling. A relatively simple way to solve this (and _prior_ pred sampling at the same time) would probably be to add a rejection sampler when protentials are present in the model.
Something to have in mind though is that it could become very inefficient if the prior is too vague with respect to the potential's constraints (a lot of proposed samples would be rejected).
Unfortunately I don't have time to work on this right now, but a low hanging fruit would be to implement the _UserWarning_ from step 1) above

AlexAndorra on 31 Jul 2020

Was this page helpful?

0 / 5 - 0 ratings