Turing.jl: API design for non-Turing samplers

Created on 12 Aug 2019 · 14Comments · Source: TuringLang/Turing.jl

One important design goal of Turing is to making implementing new sampling algorithms easier. There is already an ongoing effort for decoupling modelling and inference algorithms (see. e.g. https://github.com/TuringLang/Turing.jl/pull/793, https://github.com/TuringLang/Turing.jl/issues/746, https://github.com/TuringLang/Turing.jl/pull/750). For the next steps, it would be helpful to

discuss concretely what APIs are required from the compiler to implement a new sampler
implement a few Turing wrappers for the existing sampler packages, e.g.
- https://github.com/anthofflab/RobustAdaptiveMetropolisSampler.jl
- https://github.com/aplavin/Polychord.jl
- https://github.com/alan-turing-institute/PDSampler.jl
- https://github.com/QuantifyingUncertainty/GeneralizedMetropolisHastings.jl
- https://github.com/mauro3/KissMCMC.jl
- https://github.com/madsjulia/AffineInvariantMCMC.jl
- A C++ library of Markov Chain Monte Carlo (MCMC) methods https://www.kthohr.com/mcmclib.html

https://github.com/madsjulia/Mads.jl - https://lanl.gov/

Related: #895 #889

discussion doc

Source

yebai

👍2

All 14 comments

I think the following additional functions could be a good start. (I think that the assume and observe statements should be kept.) Please correct me if I’m wrong or if I’m missing something.

parameters(model) -> Vector{VarName}
prior(model, vn) -> Distribution
logpdf(model, vi)
logjoint(model, vi)
rand(model) -> VarInfo

In addition it would be good in my opinion to have a way to track the results (logpdf) of each assume and/or each observe statement. This could be stored in an additional data type called Trace.

For this purpose I would add the following macros.

@addlogjoint logpdf(d, x) # add return value to log joint using current VarInfo
@track logpdf(d, x) # track log pdf value
@track z = x-1 # track result of expression

Those macros can be used by the user in the model block to track expressions or by the developer of inference algos.

trappmartin on 12 Aug 2019

👍1

Are we assuming for this conversation that people are building a new InferenceAlgorithm, or do you want the discussion to be broader to include cases where people get rid of Turing's Sampler struct?

If the goal is just to look at new inference functions, we can streamline what we have pretty well. By default #793 will add a VarInfo to the Sampler if you don't provide a SamplerState object. It might also be worth moving the VarInfo to the Sampler struct itself rather than in the state, so that it's basically impossible for a user to _not_ have a VarInfo we can just manipulate for them with some common functions.

Some general functions that would work using the enforced VarInfo scheme:

parameters(model, spl) or get_parameters(model, spl) to retrieve a vector of parameters or a NamedTuple
parameters!(model, spl, val) or set_parameters!(model, spl, val) to set the parameters to a value (like vi[spl])
For MH-like samplers, it might be useful to have propose(model, spl) that returns an AbstractProposal type with a convenient default like the MH proposal. The proposal might include a log PDF, some parameters values, and maybe some other stuff. Then you could call accept!(model, spl, proposal) and it would handle all the updating for you. Users who cared a lot about the proposal mechanism could overload or wrap this function if they needed to.

One thing that we might look at is how close we can make the code in Turing to the pseudo-code you might see in an algorithm block in a new paper. Turing's pretty good on the model side, but the pseudo-code side is not quite as intuitive. I think the functions mentioned by @trappmartin would go a long way towards making everything generally accessible.

cpfiffer on 13 Aug 2019

👍1

I like we can make this in several iterations and get rid of the Sampler in the last iteration? It would probably already be nice if one can interface its own package, e.g. nested sampling, into Turing without too much hassle.

I'm not sure about the MH-like functions. But I might miss something. Maybe instead we could have simple access to internal samplers (MH, IS) and sample a value using mh(model, vn) -> Real?

I think we should also think about the interface functions we would need for VI. @torfjelde what do you think?

trappmartin on 13 Aug 2019

I think we should also think about the interface functions we would need for VI. @torfjelde what do you think?

Sorry, I completely missed this mention; didn't see it until yesterday. Also not sure if this discussion is still ongoing?

Anyways, for VI in particular the logjoint(model, vi) method proposed would be very useful. In fact I've just implemented such a method internally in my new VI PR for convenience (not meant to be exported):
https://github.com/torfjelde/Turing.jl/blob/547bb7b0f24d015371f0470345b737e52af95123/src/variational/advi.jl#L95-L100

Also a way of tracking the logpdf for each assume and observe call would be great since it would allow computation of p(x | z) and p(z). This is related to #817.

Worth noting that for using vi the Model is already instantiated, so tracking logpdf computations by adding a @track logpdf into the model wouldn't be too helpful in the VI-case.

Similarly, the parameters and parameters! functions would also be useful. Internally it's okay to just extract the parameters "manually" from the VarInfo, but for users to construct their own variational posteriors (other than mean-field approx) these methods would make it much more intuitive.

The @track macro mentioned is also something I've personally found myself wanting at times (not for VI-purposes though). I have some past experience using pymc3 in Python, and found myself using such a feature quite often. It's of course not necessary since these determinstic values can be computed with the samples of the random variables, but it's quite convenient.

torfjelde on 13 Sep 2019

👍1

Yeas, you are right. Things like @track are mostly relevant for the user and not so much for the internals or devs.

trappmartin on 13 Sep 2019

Another bit of low-hanging fruit is to make a function model(spl). Currently the signature is model(vi, spl), but since all the samplers have a vi we can just reduce the call size.

cpfiffer on 17 Sep 2019

🎉1

In reference to this comment:

Yes; I added it here a few days ago. I feel we can have a really flexible/advanced MH sampler in Turing, by falling back to different MH backends (e.g. Emsemble, RAM, vanilla MH).

I wonder if we should spin off an "AdvancedMH" package to show people how you might build a Turing-interfaced library.

cpfiffer on 18 Sep 2019

Sounds good to me, but maybe we can do the implementation in the Turing repo first? This avoids the pain of keeping two packages synced.

yebai on 18 Sep 2019

I'm going through some of these suggestions while I'm updating the MH code, and I'm wondering if we can just give everyone default observe/assume statements that just mostly work all the time. Is anyone opposed to updating the ones here a little?

Specifically, I'd like to do two things:

Remove the type constraints so that this function catches all samplers where they do not have a custom observe/assume.
Implement an insupport call in order to return -Inf when something's out of bounds, or adding an insupport check to the logpdf_with_trans call.

The reason I want a generically useful observe/assume set is that a lot of the inference logic we currently do in assume (take MH for example) should probably not be the go-to for developers. The main place for inference should be step!.

Additionally, most inference algorithms are already implementing the most basic possible assume statement (HMC) so we could really just remove a lot of these sampler-specific definitions and just stick with one well-designed default that new developers can just default to without thinking too hard about it.

Thoughts on that?

cpfiffer on 25 Sep 2019

👍1

Woops, didn't mean to close that. Sorry.

cpfiffer on 25 Sep 2019

What about having an abstract type AbstractSimulationBasedSampler similar to the Hamiltonian one and define assume and observe statements for both? But in that case the assume for AbstractSimulationBasedSampler has to be much more flexible then the one you pointed at so that we can handle e.g. #906 in the future.

Maybe it is better to make an extra PR for this?

trappmartin on 25 Sep 2019

Probably. I think we basically just need to implement the observe/assume pairs that turns the model block into a log-density function, because the newer interface stuff lends itself a lot better to cases where step! functions spit out immutable Transition structs instead of doing the state mutation we currently do everywhere.

I have a prototype function with the signature Transition(model::Model, spl::Sampler, theta::T) which just sets the VarInfo to theta and the runs the model to get the PDF. I like the work flow a lot more.

cpfiffer on 25 Sep 2019

What about having an abstract type AbstractSimulationBasedSampler similar to the Hamiltonian one and define assume and observe statements for both? But in that case the assume for AbstractSimulationBasedSampler has to be much more flexible then the one you pointed at so that we can handle e.g. #906 in the future.

I should clarify that my proposal has little to do with dispatch on a sampler type -- it should default to basically the following, regardless of sampler type:

function assume(. . . )
    return vi[vn], logpdf(dist, vi[vn])
end

This is just something helpful for foreign developers. If they want to manually modify the contents of the VarInfo, then this essentially turns the model block into a big log density function and they don't even have to import observe or assume. People who really like the way we do assume now can overload their own version, but I think just relaxing the type constraints and providing a super general function is really helpful for different types of workflows.

cpfiffer on 25 Sep 2019

Having this default assume and observe makes a lot of sense to me. I think we should provide two abstract types, one with assume and observe falling back to the current HMC's and one with them falling back to the current IS's. This should cover a lot of the sampling algorithm as a default.

xukai92 on 27 Oct 2019

👍1

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Variational Bayes inference?

skanskan · 5Comments

Compiler performance

mohamed82008 · 4Comments

Doc: more informative visualisation about MCMC

yebai · 4Comments

`MethodError: no method matching (::BallTreeDensity)(::Array{ForwardDiff.Dual....` when trying to use a KernelDensityEstimate BallTreeDensity as prior

ClaudMor · 3Comments

Distributions not compatible with AD (ForwardDiff/Flux.Tracker)

yebai · 5Comments