Turing.jl: Add example illustrating how to run multiple chains.

Created on 25 Feb 2019 · 13Comments · Source: TuringLang/Turing.jl

We currently don't have an example that runs multiple chains. However, I think most examples should actually do this. Maybe we could change the tutorials in this respect?

Running multiple chains is currently possible using:

chains = mapreduce(c -> sample(model_fun, sampler), vcat, 1:num_chains)

which will create a Turing.Chains object that contains samples for all chains.

doc

Source

trappmartin

All 13 comments

I feel like we should instead just add a keyword argument (or something) to sample where sample(model, sampler, chains=3) returns that number of chains. Is that a poor idea? I suspect we'd have to touch a lot of samplers due to the places we have sample defined.

cpfiffer on 28 Feb 2019

Ideally, we can share the same sample function implementation for all samplers. We can probably do this a bit later, for example after we finish the refactoring in https://github.com/TuringLang/Turing.jl/issues/689

yebai on 1 Mar 2019

I'll document this behavior for the moment until we get something more permanent.

cpfiffer on 1 Mar 2019

👍1

I noticed that #582 was closed. Does this plan include running the chains in parallel?

itsdfish on 1 Mar 2019

Currently this is not possible.

You cannot set a random seed for Hamiltonian based samplers for now. However, this will very soon be possible and you will be able to sample chains in parallel if only Hamiltonian based samplers are used.
The Distributions package does not implement all rand functions in Julia but used R as a backend. Therefore, it is not possible to seed the random number generator for all distributions making parallel particle based sampling difficult at the moment.

trappmartin on 1 Mar 2019

👍1

This will show up in the guide shortly. I'll be revisiting some of the tutorials soon after #695 goes through, and I'll add in this functionality to better show how to run multiple chains.

cpfiffer on 2 Mar 2019

🎉1

yebai on 8 Mar 2019

Why do mapreduce and reduce...pmap respond differently to @everywhere?
Although both approaches generate multiple chains only the first one uses parallel processes:

chns = reduce(chainscat, pmap(x->sample(model,sampler), 1:num_chains))
(from https://github.com/TuringLang/Turing.jl/pull/710/commits/bdf5d8cf52970896fb236d9c29e3a5e22e01d8c0)

chains = mapreduce(c -> sample(model, sampler), chainscat, 1:num_chains)
(from docs)

Pindar777 on 21 Apr 2019

I'm not sure what you're asking, exactly. pmap (docs) is a mapping function that uses whatever worker processes are available, while mapreduce (docs) is explicitly serial.

The second case is slightly more numerically accurate, since I believe there's some cases where random numbers are not independent when sampling from Distributions. Not sure if that's been fixed yet.

cpfiffer on 21 Apr 2019

@cpfiffer Thank you for your fast reply and explanation. I'm new to Julia and thought I'm making an error when working with the @everywhere macro. So, pmapstems from the Distrubuted-package and thus works with all workers where mapreduce is base-Julia and hence serial.

Pindar777 on 21 Apr 2019

Yes, that's correct --- we're hoping soon to make it unnecessary to worry about mapreduce and pmap, but at the moment it's a bit clumsy.

cpfiffer on 22 Apr 2019

Both functions are from Julia Base. pmap is from the Distributed module of Base. You can read more on this in the Julia documentation.

trappmartin on 22 Apr 2019

Closed in favour of https://github.com/TuringLang/Turing.jl/issues/746

yebai on 15 May 2019

Was this page helpful?

0 / 5 - 0 ratings