Turing.jl: Simple syntax for querying the model and posterior sample

Created on 23 Nov 2019 · 30Comments · Source: TuringLang/Turing.jl

In this issue, I propose the creation of a new macro @logprob to query Turing models as well as the sampled posterior. Let's take the following model as an example using the syntax of #965:

@model demo(x, y) = begin
    a ~ Normal()
    b ~ Gamma()
    y .~ Normal.(a .* x, b)
end
model = demo(rand(100), rand(100))
chain = sample(model, NUTS(10, 0.65), 10000)

I propose the following syntax:

@logprob a = 1.0, b = 2.0 | model = model -> returns log the prior
@logprob a = 1.0, b = 2.0, x = rand(100), y = rand(100) | model = model -> returns log the joint probability
@logprob 0.2 <= a <= 0.3, 1.0 <= b <= 1.1 | model = model, chain = chain -> returns log the ratio of the number of samples in chain with 0.2 <= a <= 0.3, 1.0 <= b <= 1.1. For discrete distributions, we can also use a = 2 or a == 2 for example. Passing model here can be optional.
@logprob x = rand(10), y = rand(10) | model = model, chain = chain returns the log likelihood of x = rand(10), y = rand(10) for each sample in chain
@logprob x = rand(10), y = rand(10) | model = model, a = 1.0, b = 2.0 returns the log likelihood of x = rand(10), y = rand(10) using a = 1.0, b = 2.0.

Let me know if you have comments on the syntax or if I missed any use case.

Source

mohamed82008

All 30 comments

I can also provide an @prob macro that returns the probability instead of log the probability.

mohamed82008 on 23 Nov 2019

I suggest we also allow = and == to be used interchangeably in the examples above.

mohamed82008 on 23 Nov 2019

👍1

I like it, I think it'd be a good tool to have. I have no helpful comments, so I guess you're welcome?

cpfiffer on 25 Nov 2019

👍1

I am working on it.

mohamed82008 on 25 Nov 2019

@logprob a = 1.0, b = 2.0, x = rand(100), y = rand(100) | model = model -> returns log the joint probability

model is created by demo(rand(100), rand(100)) in which x and y are already provided. It's unclear to me what the log joint is supposed to return.

xukai92 on 26 Nov 2019

👍1

I agree, it feels a bit strange to me. I think it would be more natural to write

@logprob a = 1.0, b = 2.0 | model = demo

for the prior probabilities p(a = 1.0, b = 2.0) and

@logprob a = 1.0, b = 2.0, x = x, y = y | model = demo

for the joint probability p(a = 1.0, b = 2.0, x = x, y = y). I'm not even sure if it should be a macro, couldn't one just define a function logprob that allows to evaluate logprob(demo, (a = 1.0, b = 2.0)) or logprob(demo, (a = 1.0, b = 2.0, x = x, y = y))? Maybe for convenience one could even define demo.logprob instead which allows to write demo.logprob(a = 1.0, b = 2.0).

devmotion on 26 Nov 2019

@xukai92 the log joint probability case will replace the data inside model and compute the log joint probability for the given data and parameters. If that's not intuitive enough, I can limit this macro to the likelihood and posterior only. Convenience functions can then be made to calculate the prior and joint probability for NamedTuple parameters as @devmotion suggested. In case of the joint probability, the data inside the model can be used.

Actually, for all of them I can just make a function under the hood that can be called instead of the macro. So I have a few questions for you all:

Do you prefer a macro and a function API, a function only API or a macro only API?
Should I include the prior and joint probability syntax in the macro with the model = demo suggestion?
Is conveniently calculating the log joint probability for new data and new parameters useful in practice or not?

mohamed82008 on 26 Nov 2019

I would actually prefer to have both. People who do a lot of automated model stuff would prefer a function API, whereas all the folks who are messing around with the REPL prefer the macro API. Might be that one ends up being better than the other, so we could deprecate one if it is just less useful.

cpfiffer on 29 Nov 2019

Sounds good.

mohamed82008 on 2 Dec 2019

Re 1: I agree with Cameron that we do need both.
Re 2: I feel model = demo is more intuitive than model = model
Re 3: I think it's useful in general.

xukai92 on 2 Dec 2019

Will this allow this also allow to return the point logp of the each data point given a parameter set?

jonasmac16 on 11 Dec 2019

Yes

mohamed82008 on 11 Dec 2019

👍1

That's pretty cool. Is this available once the referenced PR is accepted?

jonasmac16 on 11 Dec 2019

Yup!

mohamed82008 on 11 Dec 2019

❤1

@logprob 0.2 <= a <= 0.3, 1.0 <= b <= 1.1 | model = model, chain = chain -> returns log the ratio of the number of samples in chain with 0.2 <= a <= 0.3, 1.0 <= b <= 1.1. For discrete distributions, we can also use a = 2 or a == 2 for example. Passing model here can be optional.

I am slightly confused by the above feature. It seems like this should be a filter-style function in the MCMCChains package. Also, what are the potential use cases?

yebai on 17 Dec 2019

Hypothesis testing, for one. If you wanted to have some measure of confidence about whether a parameter is negative and you wanted a quick check:

@logprob a >= 0 | model=model, chain=chain

I think it's a little nicer to work with than a Chains object, and it's very quick to write.

cpfiffer on 17 Dec 2019

I am slightly confused by the above feature. It seems like this should be a filter-style function in the MCMCChains package.

Indeed IMO that feature seems more like one of the other convenient functionalities provided by MCMCChains. It is also not clear to me why specifying model = model would be useful in that example?

The nice thing about using a function instead of a macro would be that one could just dispatch on Chains (or AbstractChains) and implement the chain-based evaluations in MCMCChains. Since there seems to be an agreement that also a macro is needed, maybe one solution could be to provide a generic @logpdf macro in Turing that parses the input and forwards NamedTuples (or something else?) to the functional variant. The macro feels slightly more restricted, however, since it seems to require some protected parameter names (such as model or chain, e.g.).

I think it's a little nicer to work with than a Chains object, and it's very quick to write.

To me logprob(x -> x.a >= 0, chain) or

logprob(chain) do sample
    sample.a >= 0
end

doesn't feel too bad and actually more powerful and flexible than a macro.

devmotion on 17 Dec 2019

To me logprob(x -> x.a >= 0, chain) or
logprob(chain) do sample
    sample.a >= 0
end

I like that! I think we could probably stick this on the MCMCChains side.

cpfiffer on 17 Dec 2019

I agree that this doesn't belong here. I actually left it out from #997 .

mohamed82008 on 18 Dec 2019

You're referring only to the filtering functionality, I guess? As far as I can see, https://github.com/TuringLang/Turing.jl/pull/997 contains the chain-based feature

Query the likelihood given every sample in the chain using prob"y = rand(3) | chain = chain, model = demo"

devmotion on 18 Dec 2019

@devmotion yes that feature is actually important, @trappmartin and @sethaxen have uses for it. Imo, I see this belongs to Turing (and in the future DynamicPPL) because the same VarNames we use in VarInfo are keys in the chain:Chain output by Turing.sample. I make use of that to feed chain[i] for example into the VarInfo before running the model in the LikelihoodContext. This may not be possible for any arbitrary chain coming from other packages, and going from chain to NamedTuple or another standard underlying representation is completely non-trivial for nested array types, e.g. when the parameter is an array of matrices. So model = demo, chain = chain makes sense for 2 reasons:

Both demo and chain are used to compute the likelihoods, and
chain was generated from demo to begin with and the 2 are compatible, while other seemingly equivalent models may not work with chain and other seemingly equivalent chains may not work with demo.

mohamed82008 on 18 Dec 2019

👍1

chain was generated from demo to begin with and the 2 are compatible, while other seemingly equivalent models may not work with chain and other seemingly equivalent chains may not work with demo.

To me that sounds like the syntax should be just y = rand(3) | chain = chain (without model = demo) to avoid any incompatibilities between chain and model, and just use the model saved in chain.info which was used to generate chain.

This may not be possible for any arbitrary chain coming from other packages

Sure, but does this imply that this functionality can't be provided in, e.g., MCMCChains?

devmotion on 18 Dec 2019

👍1

Hmm using the model in chain sounds good, I had missed the fact that we can store these in chain. The only catch is that the chain.info field optionally stores the model in Turing, but that's easily fixable. Thanks!

mohamed82008 on 18 Dec 2019

Sure, but does this imply that this functionality can't be provided in, e.g., MCMCChains?

Yes we need access to VarInfo which is not seen by MCMCChains, may be in a future PR we can consider moving things and interfacing.

mohamed82008 on 18 Dec 2019

So here is a nice middle ground, if model is defined in chain, we use that. Otherwise, we check if model was passed on the RHS. We do the same for varinfo. That way even if the user forgot to save model and VarInfo in chain, we can still save the day. Sounds good?

mohamed82008 on 18 Dec 2019

Hmm nevermind, that won't be necessary if chain always saves varinfo and model. Any reason not to do save them by default in my PR @cpfiffer?

mohamed82008 on 18 Dec 2019

Even if it is saved by default, I will leave the backup there.

mohamed82008 on 18 Dec 2019

No reason that I can think. You could probably just set the keyword save_state=true and everything should be good.

cpfiffer on 19 Dec 2019

Cool, thanks!

mohamed82008 on 19 Dec 2019

Closed via #997.

mohamed82008 on 25 Dec 2019

Was this page helpful?

0 / 5 - 0 ratings

Related issues

rand() and broadcast question

mohamed82008 · 3Comments

Parameter values initialization does not speed up NUTS

ClaudMor · 4Comments

JuliaCon hackathon?

hessammehr · 4Comments

Variational Bayes inference?

skanskan · 5Comments

Error with Gibbs compositional sampler and an array

mateuszbaran · 5Comments