Turing.jl: Maximum a posterior for Turing models

Created on 22 Nov 2018 · 12Comments · Source: TuringLang/Turing.jl

Excerpt from slack discussions

@goedman

Not sure how often this channel is used but I wonder if Turing has a way to retrieve the model portion passed into the @model ... macro. Before actually running the sampler I would like to compute the maximum a posterior (as is done in McElreath’s book StatisticalRethinking, I would use e.g. Optim.jl).

@mohamed82008

using Turing

@model gdemo(x, y) = begin
s ~ InverseGamma(2,3)
m ~ Normal(0,sqrt(s))
x ~ Normal(m, sqrt(s))
y ~ Normal(m, sqrt(s))
return s, m
end
x = 1.5
y = 2.0
model = gdemo(x, y)
vi = Turing.VarInfo()
model(vi, Turing.SampleFromPrior())

function nlogp(sm)
vi.vals .= sm
model(vi, Turing.SampleFromPrior())
-vi.logp
end

using Optim
sm_0 = Float64.(vi.vals)
lb = [0.0, -Inf]
ub = [Inf, Inf]
result = optimize(nlogp, lb, ub, sm_0, Fminbox())

Related: #421 #72

discussion doc new-feature

Source

yebai

Most helpful comment

I'll add a little section about this behavior.

cpfiffer on 22 Nov 2018

🎉3

All 12 comments

@cpfiffer could you add some doc (perhaps under internals section) on this topic?

yebai on 22 Nov 2018

Hi @yebai I hesitated when I asked the question to switch to Slack. If you prefer that folks ask these questions using Github issues, please let me know.

goedman on 22 Nov 2018

@goedman No worries, either Slack / GitHub works! I moved this discussion here because it's something we want to solve in the near future.

yebai on 22 Nov 2018

I'll add a little section about this behavior.

cpfiffer on 22 Nov 2018

🎉3

I have typed up something very quick, which mostly just liberally copies @mohamed82008's great little hack. Once I send up a pull request I'll probably stick it at the bottom of the Advanced section. Any comments to add in here?

Maximum a Posteriori Estimation

Turing does not currently have built-in methods for calculating the maximum a posteriori (MAP) for a model. This is a goal for Turing's implementation, but for the moment, we present here a method for estimating the MAP using Optim.jl.

using Turing

# Define the simple gdemo model.
@model gdemo(x, y) = begin
    s ~ InverseGamma(2,3)
    m ~ Normal(0,sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
    return s, m
end

# Define our data points.
x = 1.5
y = 2.0

# Set up the model call, sample from the prior.
model = gdemo(x, y)
vi = Turing.VarInfo()
model(vi, Turing.SampleFromPrior())

# Define a function to optimize.
function nlogp(sm)
    vi.vals .= sm
    model(vi, Turing.SampleFromPrior())
    -vi.logp
end

# Import Optim.jl.
using Optim

# Create a starting point, call the optimizer.
sm_0 = Float64.(vi.vals)
lb = [0.0, -Inf]
ub = [Inf, Inf]
result = optimize(nlogp, lb, ub, sm_0, Fminbox())

cpfiffer on 23 Nov 2018

Thank you Cameron ( @cpfiffer ).

I have been experimenting a bit with creating a function for the lower part. Would you mind taking a quick peek as a sanity check?

using Turing, Optim

function maximum_a_posteriori(model, lower_bound, upper_bound)

  vi = Turing.VarInfo()
  model(vi, Turing.SampleFromPrior())

  # Define a function to optimize.
  function nlogp(sm)
    vi.vals .= sm
    model(vi, Turing.SampleFromPrior())
    -vi.logp
  end

  start_value = Float64.(vi.vals)
  optimize((v)->nlogp(v), lower_bound, upper_bound, start_value, Fminbox())
end


# Define the simple gdemo model.
@model gdemo(x, y) = begin
    s ~ InverseGamma(2,3)
    m ~ Normal(0,sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
    return s, m
end

# Define our data points.
x = 1.5
y = 2.0

# Set search bounds
lb = [0.0, -Inf]
ub = [Inf, Inf]

model = gdemo(x, y)

result = maximum_a_posteriori(model, lb, ub)
println()

result.minimizer

which returns:

julia> include("/Users/rob/.julia/dev/StatisticalRethinking/chapters/02/map/map05.jl")
loaded
[ Info:  Assume - `s` is a parameter
[ Info:  Assume - `m` is a parameter
[ Info:  Observe - `x` is an observation
[ Info:  Observe - `y` is an observation

2-element Array{Float64,1}:
 0.9074074076939405
 1.1666666666666472

I assume the "loaded" comes from the Turing compiler?

goedman on 24 Nov 2018

Looks sane to me! The loaded line I believe comes from Julia when it loads some package.

cpfiffer on 24 Nov 2018

👍1

Looks like this was removed from the docs. Has MAP been added to Turing yet? Cool to see how easy it is to implement with Optim though.

joshualeond on 15 Feb 2019

Did you use it? I can easily put it back into the TuringModels.jl repo. Having CmdStan, Mamba, DynamicHMC and Turing all in a single project became simply too slow to work on. Let me know.

I typically use a log likelihood function + Optim. But getting it as rock-solid as in quap() in rethinking is tricky w.r.t. the initial values, in e.g. 04/clip-24-29s.jl and 10/m10.02d1.

In fact, I should put it back!

goedman on 15 Feb 2019

Strange. I seem to recall having put the MAP guide into the docs, but the history shows it never actually made it in. I'll add it (back?) to the docs section, at least until we have a more formal implementation for MAP estimation.

cpfiffer on 15 Feb 2019

Hey @goedman, I hadn't used it before. I've been going through McElreath's book with R and thought it would be fun to reproduce it in Julia. Just stumbled across this issue after some searching. I think the DynamicHMC examples are really interesting to see.

joshualeond on 18 Feb 2019

Hi @joshualeond, thank you.

I too have been very impressed by the performance and how to formulate the models in DynamicHMC. In fact I've started to use the logpdf formulations more often in CmdStan. At least one model in DynamicHMC do still run into (AD?) problems (m10.4d.jl is not correct).

I can't wait until (maybe early next year?) there are 2 additional (DynamicHMC and Turing) pure Julia options available for MCMC!

I did add the maximum_a_posterior example back in m2.1t.jl and in the master docs for TuringModels. I hadn't realized Cameron had documented it as well.

goedman on 18 Feb 2019

Was this page helpful?

0 / 5 - 0 ratings