Turing.jl: SMC arguments

Created on 28 Feb 2018 · 32Comments · Source: TuringLang/Turing.jl

Sorry if the answer to this question is obvious, but can the SMC function take other arguments than the number of particles, e.g. resampling threshold/method? Also, is it possible to retrieve the weights from the SMC sampler, along with the samples?

discussion new-feature

Source

awellis

Most helpful comment

I can see the benefit but I'm still not convinced as storing weights for a few thousand particles over a reasonable number of iterations would easily generate a large memory overhead. Even with 16bit precision. And if I understand it right, visualizing statistics on the weights would actually be already quite informative. @awellis is this correct?

I'll make a new issue for this as it seems to be a feature many people could be interested in.

trappmartin on 4 Sep 2018

👍2

All 32 comments

Hi !
It's indeed not clearly documented but the objected returned when calling sample() on a model is a Chain. It has a value2 attribute being an array of all samples, each Sample having a weight attribute.

Best,
regards

emilemathieu on 12 Mar 2018

Hi,
I've been trying out Turing again, and as I'm quite new to Julia, how to extract samples and weights isn't obvious to me.

As far as I can, I can obtain a matrix containing all particles from the value attribute of a Chain and in the value2 attribute, I get a Sample object, which contains a weight and a dictionary of samples for each iteration.

It seems that the weight represents the log evidence at each iteration, but is it possible to access the weight of each particle at each iteration? This would be useful, e.g. for plotting a weighted mean of the state variables, if resampling is not performed at each time step, and the weights are not uniform.

Is this currently possible? (Sorry if the answer is obvious)

My other question referred to the resampling method and resampling threshold: is it possible to use any resampling method other than the default (resampleSystematic), and to change the threshold used for resampling?

cheers,
Andrew

awellis on 18 Jun 2018

Hi Andrew !

Indeed Chain.value2 is a list of Sample with the weight being the log evidence at each iteration. You are right, it would be better to have each particle weight at each iterations, you could modify the SMC code to save all particles weights.

Concerning your 2nd question, there is a resampler argument in SMC, for which you can use one of the implemented resampling schemes.

Cheers,
Emile

PS: This should change in the future to be easier to use cf https://github.com/TuringLang/Turing.jl/issues/413.

emilemathieu on 19 Jun 2018

Hi Emile,

thanks for your reply. I will try to modify the SMC code to save the particle weights at each iteration, and I've figured out how to use the resampler argument, thanks.

I'm wondering if there might be a problem with resampling. I've been playing around with a state space model (using SMC), and it seems that no matter how low I set the resampling threshold, the particles are resampled at the first iteration, which results in all particles having the same value.

I have created an example here.

Is this intended behaviour?

cheers,
Andrew

awellis on 20 Jun 2018

Hi Andrew,

In SMC, the resampler_threshold is used here: https://github.com/TuringLang/Turing.jl/blob/master/src/samplers/smc.jl#L59.

I'm not sure why that's happening. You should try to print ess and spl.alg.resampler_threshold * length(particles) to understand why the condition ess <= spl.alg.resampler_threshold * length(particles) is always true.

May be related to the fact that the particles' weights are all initialised to 0.

Cheers,
Emile

emilemathieu on 20 Jun 2018

ok, thanks.

It'll take me a while to figure out how to do that, I'm pretty new to Julia. Any hints on how to print ess?

But even if resampling is performed, this shouldn't result in the particles all being descended from one ancestor, should it?

awellis on 20 Jun 2018

Not necessarily indeed, but it could if you have to few particles and one of them therefore dominates the others (in terms of associated weights).

You can use println("ess: $(ess)").

PS: you need to reload the module Turing so that this change takes effect.

emilemathieu on 20 Jun 2018

println("ess: $(ess)"): that is clear, but how do I get the ess value during execution of the model?

awellis on 20 Jun 2018

If you add println("ess: $(ess)") in the SMC file after computing the ess.
This file can be found in ~/.julia/v0.6/Turing/src/samplers/smc.jl in macOS.

emilemathieu on 20 Jun 2018

ok, thanks, that worked ok. I'll spend some time looking at this.

awellis on 20 Jun 2018

Hi Emile, thanks for your help.

When is the step function used in the SMC sampler?

And how is it possible that there are only a subset of unique particles after the first iteration, if there is no resampling at the first iteration?

I have tried to show what I mean here.

This is obviously a silly example, I just don't understand how each of the 10 particles has one of only three unique values if there is no resampling.

cheers,
Andrew

awellis on 22 Jun 2018

The step method is called when SMC is used within a Gibbs sampler (i.e. CSMC).

This seems weird indeed, i’ll Look into it

Cheers,
Emile

emilemathieu on 22 Jun 2018

hi Emile,

I still think it looks like something is going wrong at the first iteration. Changing the initialisation of the weights doesn't seem to have any effect.

cheers,
Andrew

awellis on 26 Jun 2018

Hi Emile,

sorry to keep bothering you about this, but I'm having problems with modifying the SMC code so that the weights of all particles are returned for each iteration. Currently, the value2 attribute returned after sampling has a weight for each trajectory, i.e. the log evidence for each particle summed over all iterations. Is that correct? I need at least the sum of the log weights for each iteration, or just the weights.

The reason for doing this is that I want to do a model comparison at each iteration during a run of a particle filter. I realise that this may not really be the intended usage of the SMC algorithm, but are there any guidelines on how to proceed?

best wishes,
Andrew

awellis on 24 Aug 2018

Hi Andrew,

Something seems weird indeed.

In SMC things are happening at consume(particles).

Then at each iterations the weights are computed by _, z2 = weights(pc). So you want to replace that line by Ws, z2 = weights(pc) and then save these weights Ws in pc (as a matrix or as you'd like).

Please tell me if that does not help enough.

Cheers,
Emile

emilemathieu on 24 Aug 2018

ok, thanks, I'll try it tonight and let you know.

awellis on 24 Aug 2018

👍1

ok, so far, adding Ws, z2 = weights(pc) works, and I can print the vector of weights to the console at each iteration, but saving the weights in the ParticleContainer is beyond me at the moments. I'd really appreciate any help.

awellis on 27 Aug 2018

You could update ParticleContainer, Consume and sample with the code below.

In _src/core/container.jl_

type ParticleContainer{T<:Particle}
  model :: Function
  num_particles :: Int
  vals  :: Array{T,1}
  logWs :: Array{Float64,1}  # Log weights (Trace) or incremental likelihoods (ParticleContainer)
  logWst :: Array{Array{Float64,1},1}  # Log weights saved BOTH in time and particle index
  logE  :: Float64           # Log model evidence
  # conditional :: Union{Void,Conditional} # storing parameters, helpful for implementing rejuvenation steps
  conditional :: Void # storing parameters, helpful for implementing rejuvenation steps
  n_consume :: Int # helpful for rejuvenation steps, e.g. in SMC2
  ParticleContainer{T}(m::Function,n::Int) where {T} = new(m,n,Array{Particle,1}(),Array{Float64,1}(),Array{Float64,1}(),0.0,nothing,0)
end

function Base.consume(pc :: ParticleContainer)
  @assert pc.num_particles == length(pc)
  # normalisation factor: 1/N
  _, z1      = weights(pc)
  n = length(pc.vals)

  particles = collect(pc)
  num_done = 0
  logWst = zeros(Float64, n)
  for i=1:n
    p = pc.vals[i]
    score = consume(p)
    score = isa(score, ForwardDiff.Dual) ? realpart(score) : score
    if isa(score, Real)
      score += isa(getlogp(p.vi), ForwardDiff.Dual) ? realpart(getlogp(p.vi)) : getlogp(p.vi)
      resetlogp!(p.vi)
      increase_logweight(pc, i, Float64(score))
      logWst[i] = Float64(score)
    elseif score == Val{:done}
      num_done += 1
    else
      error("[consume]: error in running particle filter.")
    end
  end
  push!(pc.logWst, logWst)
  if num_done == length(pc)
    res = Val{:done}
  elseif num_done != 0
    error("[consume]: mis-aligned execution traces, num_particles= $(n), num_done=$(num_done).")
  else
    # update incremental likelihoods
    _, z2      = weights(pc)
    res = increase_logevidence(pc, z2 - z1)
    pc.n_consume += 1
    # res = increase_loglikelihood(pc, z2 - z1)
  end

  res
end

In _src/samplers/smc.j_

function sample(model::Function, alg::SMC)
  spl = Sampler(alg);

  particles = ParticleContainer{Trace}(model)
  push!(particles, spl.alg.n_particles, spl, VarInfo())

  while consume(particles) != Val{:done}
    ess = effectiveSampleSize(particles)
    if ess <= spl.alg.resampler_threshold * length(particles)
      resample!(particles,spl.alg.resampler)
    end
  end
  w, samples = getsample(particles)
  res = Chain(w, samples)
  return res, particles.logWst
end

Then run the sampler and get the results

chain, logWst = sample(StateSpaceModel_1(y, true), SMC(100))

Hope that this helps you !
Cheers

emilemathieu on 27 Aug 2018

👍1

Thank you! I guess these are the unnormalised weights, right? The output (logWst) is a vector of vectors, but the length is equal to n_iterations + 1, with the final weights vector consisting of zeros.

awellis on 27 Aug 2018

They are unnormalised indeed.
Update consume with the version below, you shouldn't have the extra vector or zeros.

function Base.consume(pc :: ParticleContainer)
  @assert pc.num_particles == length(pc)
  # normalisation factor: 1/N
  _, z1      = weights(pc)
  n = length(pc.vals)

  particles = collect(pc)
  num_done = 0
  logWst = zeros(Float64, n)
  for i=1:n
    p = pc.vals[i]
    score = consume(p)
    score = isa(score, ForwardDiff.Dual) ? realpart(score) : score
    if isa(score, Real)
      score += isa(getlogp(p.vi), ForwardDiff.Dual) ? realpart(getlogp(p.vi)) : getlogp(p.vi)
      resetlogp!(p.vi)
      increase_logweight(pc, i, Float64(score))
      logWst[i] = Float64(score)
    elseif score == Val{:done}
      num_done += 1
    else
      error("[consume]: error in running particle filter.")
    end
  end

  if num_done == length(pc)
    res = Val{:done}
  elseif num_done != 0
    error("[consume]: mis-aligned execution traces, num_particles= $(n), num_done=$(num_done).")
  else
    # update incremental likelihoods
    push!(pc.logWst, logWst)
    _, z2      = weights(pc)
    res = increase_logevidence(pc, z2 - z1)
    pc.n_consume += 1
    # res = increase_loglikelihood(pc, z2 - z1)
  end

  res
end

emilemathieu on 27 Aug 2018

👍1

Thank you very much!

awellis on 27 Aug 2018

Hey ! Did you eventually succeeded @awellis ?

emilemathieu on 2 Sep 2018

@emilemathieu
Hi, I did get it to work, thanks. I am currently trying to get it work in Julia 1.0

awellis on 4 Sep 2018

@emilemathieu got it working in Julia 1.0 as well. I forked the Turing.jl repository.

I guess there must be a more elegant way of saving the weights, though, without hacking the sample method.

awellis on 4 Sep 2018

@awellis would it be sufficient to store statistics on the weights?

We could try to provide these in the resulting Chain object for particle-based methods by default. Storing all particle weights for each iteration, however, sounds rather costly.

trappmartin on 4 Sep 2018

@trappmartin could be optional and set to False by default ?

emilemathieu on 4 Sep 2018

If it's not necessary, I would prefer to only store statistics on the particle weights. Are there cases, apart from debugging, where it would be necessary to return the particles and their weights for each iteration?

trappmartin on 4 Sep 2018

@trappmartin @emilemathieu I think the history of the log evidence ( i.e. the sum of the log weights for each iteration) would actually be sufficient.

awellis on 4 Sep 2018

👍1

@awellis's case ?
it would be a huge tool for debugging indeed !

emilemathieu on 4 Sep 2018

I am using the weights so that I can visualize the SMC performance

awellis on 4 Sep 2018

I'll make a new issue for this as it seems to be a feature many people could be interested in.

trappmartin on 4 Sep 2018

👍2

@emilemathieu @trappmartin visualizing statistics on the weights would be fine for me.

awellis on 4 Sep 2018

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Lessons learned and open questions to improve documentation

scheidan · 5Comments

Guide on http://turing.ml/latest/guide.html is broken

trappmartin · 3Comments

JuliaCon hackathon?

hessammehr · 4Comments

Compiler performance

mohamed82008 · 4Comments

Numerical error messages: real problem or just terminal noise?

fredcallaway · 5Comments