Turing.jl: Provide a way to set initial state of HMC

Created on 2 Oct 2018 · 15 Comments · Source: TuringLang/Turing.jl

new-feature

xukai92

All 15 comments

This would be very helpful and is probably needed for all samplers if a very non-informative prior is used.

For example, sometimes I'd like to use an improper flat prior like Uniform(0, Inf), which I can simulate with an extremely wide uniform distribution. In this case having a good starting value is crucial.
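To put a rough number on why the starting value matters here, this is a small sketch in plain Julia. It assumes the sampler draws its initial value from the prior, and both the prior width and the "plausible" interval are made-up values for illustration only.

```julia
# ASSUMPTION: illustrative numbers, not taken from any real model.
upper = 1e6               # width of the "almost flat" prior Uniform(0, upper)
plausible = (0.0, 10.0)   # hypothetical interval holding most of the posterior mass

# Probability that one draw from Uniform(0, upper) lands in the plausible interval.
p_good_start = (plausible[2] - plausible[1]) / upper

println(p_good_start)     # 1.0e-5
```

With these numbers only about one prior draw in 100,000 starts anywhere near the region of interest, which is why a user-supplied starting value helps.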

scheidan on 21 Nov 2018

I would really appreciate this feature.
When training a very complex model, it would be helpful to provide the sampler with a previously calculated estimate of the parameters, in order to avoid gradient errors and local minima, and generally just to speed things up.

LukasUlrych on 16 Jan 2019

Thanks, @LukasUlrych, @scheidan. We'll take a look at this soon.

yebai on 16 Jan 2019

Moved to https://github.com/TuringLang/AdvancedHMC.jl/issues/11

yebai on 1 Mar 2019

@yebai This is a Turing interface issue. Even without AdvancedHMC we can do this. The problem is how to let the user provide the initial value so that we can construct a varinfo for the model.

xukai92 on 1 Mar 2019

I'm posting a minimal example of a hack to achieve this, for people who want to do it before we formally provide an API.

using Turing

@model gdemo(x, y) = begin
  s ~ InverseGamma(2,3)
  m ~ Normal(0,sqrt(s))
  x ~ Normal(m, sqrt(s))
  y ~ Normal(m, sqrt(s))
end

# Call a "dummy" sample just to get `vi` initialised
chn_init = sample(gdemo(1.5, 2), HMC(1, 0.1, 5), save_state=true);
vi = chn_init.info.vi

# All the variable names are in `vi.vns` - you need to find which one you want to change
# I'm using the first one as an example
vn = vi.vns[1]
vi[vn] = [2.0]  # NOTE: you have to assign a vector

# Now do your real sampling
chn = sample(gdemo(1.5, 2), HMC(1000, 0.1, 5), resume_from=chn_init)

xukai92 on 6 Mar 2019

🎉1

Hi guys,

I am using Turing.jl version 0.7.0, and the hack above no longer seems to work. Or is it only applicable to HMC? In fact, I am now using the Metropolis-Hastings sampler because I have some trouble with the autodiff (and hence cannot use HMC). Could you guys give me some help?

I think this is a very useful feature, so it would be very helpful if it were included in the manual (even in the form of a "hack").

Thank you very much.

Best,
Lam

lamhm on 11 Oct 2019

A couple things have moved around on this one. I've updated the posted example to show how to do this in a 0.7+ world.

using Turing

@model gdemo(x, y) = begin
    s ~ InverseGamma(2,3)
    m ~ Normal(0,sqrt(s))
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end

# Call a "dummy" sample just to get `vi` initialised
# Note that now the sample size of one has been moved
# out of the HMC call.
chn_init = sample(gdemo(1.5, 2), HMC(0.1, 5), 1, save_state=true);

# vi is now stored in .info.spl.state.vi, not .info.vi
vi = chn_init.info.spl.state.vi

# To get the varname out you'll need to extract the vn for a given symbol
# from vi.metadata. This gets the varname for m:
vn = vi.metadata.m.vns[1]
vi[vn] = [2.0]  # NOTE: you still have to assign a vector

# Now do your real sampling
chn = sample(gdemo(1.5, 2), HMC(0.1, 5), 1000, resume_from=chn_init)

You can also do the following, which is much shorter but has the problem that you have to specify all your parameters at once:

chn = sample(gdemo(1.5, 2), HMC(0.1, 5), 1000, init_theta=[2.0, 1.4])

There have been murmurings that we might support a more user-friendly syntax like

theta = (m = 4.0, s = 2.0)
chn = sample(gdemo(1.5, 2), HMC(0.1, 5), 1000, init_theta=theta)

but it hasn't been implemented yet. I can give it a shot if you'd like; there seems to be some measure of interest in initializing parameters ergonomically.
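In the meantime, named values can be flattened into the vector form by hand. The sketch below is plain Julia and does not call Turing; `flatten_init` is a hypothetical helper, not a Turing function. The parameter order (`s`, then `m`, as declared in the model) is an assumption you should check against your own model before passing the result as `init_theta`.

```julia
# Hypothetical helper (not part of Turing): flatten named initial values
# into a vector that follows an explicit parameter order.
function flatten_init(theta::NamedTuple, order::Tuple{Vararg{Symbol}})
    # Refuse to guess: every parameter in `order` needs a value.
    missing_names = [name for name in order if !haskey(theta, name)]
    isempty(missing_names) ||
        throw(ArgumentError("no initial value for: " * join(missing_names, ", ")))
    return Float64[theta[name] for name in order]
end

theta = (m = 4.0, s = 2.0)

# ASSUMPTION: the sampler expects parameters in declaration order (s, then m).
init = flatten_init(theta, (:s, :m))

println(init)   # [2.0, 4.0]
```

Writing the order out explicitly keeps the mapping from names to positions visible, which is the part that is easy to get wrong with a bare vector.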

cpfiffer on 11 Oct 2019

👍2

Thank you very much, @cpfiffer. The code above works perfectly.

lamhm on 13 Oct 2019

Can we close this now?

mohamed82008 on 18 Apr 2020

Hi, I'm currently migrating some of my code from AdvancedHMC to Turing, and I would love to be able to pass in all of the same initialisations as easily as I can with AHMC. Commenting here to express my interest in having something like

theta = (m = 4.0, s = 2.0)
chn = sample(gdemo(1.5, 2), HMC(0.1, 5), 1000, init_theta=theta)

exposed to the user.

HarrisonWilde on 11 May 2020

@mohamed82008 I saw you had mentioned somewhere that we now had a good way to input NamedTuples? I have the setnamedtuple function in Turing proper, but it's got a couple issues.

If we have one of these functions ready to go (that's better than setnamedtuple) I can prep a PR real quick.

cpfiffer on 11 May 2020

Sorry @cpfiffer, I didn't see the mention. We have a way to input NamedTuples, but it's not used or exposed in this context yet. We basically need to run the model in the PriorContext with the variables passed in the vars field of the context. model(vi, SampleFromPrior(), PriorContext((m = 1, s = 1))) should do the trick.

mohamed82008 on 15 May 2020

This is also quite generic.

mohamed82008 on 15 May 2020

Here is a quick hack for users that should work after the bug fix PR #1281 gets merged.

using Turing

@model gdemo(x, y) = begin
    s ~ InverseGamma(2,3)
    m ~ truncated(Normal(0.0, sqrt(s)), 0.0, 2.0)
    x ~ Normal(m, sqrt(s))
    y ~ Normal(m, sqrt(s))
end;
model = gdemo(1.0, 1.0);
varinfo = Turing.VarInfo(model);
model(varinfo, Turing.SampleFromPrior(), Turing.PriorContext((m = 1.0, s = 1.0)));
init_theta = varinfo[Turing.SampleFromPrior()]
sample(model, HMC(0.01, 1), 100, init_theta = init_theta)

mohamed82008 on 15 May 2020

👍2
