Pymc3: DIC method taking hours

Created on 9 Aug 2017 · 17Comments · Source: pymc-devs/pymc3

Hello,
I'm using the DIC function in a model with 10'000 iterations and 3 chains. It takes hours (~3) to compute the DIC and I found the problem is at this line https://github.com/pymc-devs/pymc3/blob/3232e160dfc728546c4dcedfc0a195831bcd1cf4/pymc3/stats.py#L119.

I tried to parallelize it, but it seems it doesn't parallelize easily. Can you help me?

Source

denadai2

All 17 comments

Yeah there are some recent discussion here about how to speed up evaluation of logp function. Wanna try the solution inside and submit a PR?

junpenglao on 9 Aug 2017

@junpenglao I try :) thx

denadai2 on 9 Aug 2017

ohhh I forgot to fix DIC! (and bpic).
@denadai2 just out of curiosity, why are you using DIC instead of WAIC?

aloctavodia on 9 Aug 2017

@aloctavodia great question :P I suppose I'm using it because I'm interested on describing the parameters instead of predictions (LOO etc)... Is it wrong in this case?

Btw I tried to "cache" logp but it doesn't copy the function. It is very slow to create the list "cached". I'm not an expert, so help me. Meanwhile I did this mess:

def dic2(trace, model=None):
    model = pm.stats.modelcontext(model)
    cached = [(var, var.logp_elemwise) for var in model.observed_RVs]
    cached2 = [(var, var.logp_elemwise) for var in model.free_RVs]

    d_obs = []
    d_frees = []
    for pt in trace:
        d_obs.append(np.sum([logp(pt) for var, logp in cached]))
        d_frees.append(np.sum([logp(pt) for var, logp in cached2]))

    mean_deviance = -2 * np.mean(np.add(d_obs, d_frees))

    free_rv_means = {rv.name: trace[rv.name].mean(
        axis=0) for rv in model.free_RVs}
    deviance_at_mean = -2 * model.logp(free_rv_means)

    return 2 * mean_deviance - deviance_at_mean

Don't kill me but I'm in rush with some experiments and this is 10x faster :D

denadai2 on 9 Aug 2017

@denadai2 actually you just need to replace

mean_deviance = -2 * np.mean([model.logp(pt) for pt in trace])

with

logp=model.logp
mean_deviance = -2 * np.mean([logp(pt) for pt in trace])

junpenglao on 9 Aug 2017

yes, but it doesn't work faster.
It seems that logp is different from elementwiset

denadai2 on 9 Aug 2017

are u sure? I tested it locally and it runs much faster.

junpenglao on 9 Aug 2017

Proof:

%timeit pooled_model.logp(pooled_trace[0])
logp = pooled_model.logp
%timeit logp(pooled_trace[0])
1 loop, best of 3: 1.14 s per loop
1000 loops, best of 3: 421 µs per loop

Here you would assume cached logp is a lot better right? It is not, because the copy takes a lot of time.

%timeit cached = [(pt, pooled_model.logp) for pt in pooled_trace[-20:]]
1 loop, best of 3: 21.8 s per loop
cached = [(pt, pooled_model.logp) for pt in pooled_trace[-20:]]
%timeit mean_deviance = -2 * np.mean([logp(pt) for pt, logp in cached])
100 loops, best of 3: 7.72 ms per loop

So cached assignment takes all the time

denadai2 on 9 Aug 2017

@denadai2 both DIC and WAIC are trying to estimate the same thing, the main difference is that DIC assumes a (multivariate) Gaussian posterior distribution, and WAIC does not make this assumption. @junpenglao is right, the only thing you need to do is to write logp=model.logp and then use logp. Let me know if you are going to do a PR, or if I should do it, and thanks a lot for reporting this issue!

aloctavodia on 9 Aug 2017

@aloctavodia ah ok I try

DIC/WAIC: ok. I thought WAIC was more related to LOO and predictions. I'll look at the paper again :) thx

denadai2 on 9 Aug 2017

@denadai2 You don't need to cache the logp function more than once:

cached = model.logp
%timeit mean_deviance = -2 * np.mean([logp(pt) for pt in trace])

aseyboldt on 9 Aug 2017

👍1

@aseyboldt indeed :P I pushed the fixed version

denadai2 on 9 Aug 2017

👍1

I am a little bit late to the party :) @denadai2, I just want to add a reference to the Watanabe webpage, just in case you want to read more about WAIC (and loo, dic etc).

aloctavodia on 9 Aug 2017

@aloctavodia uhm ok! Thanks. I think I'll open a thread in discourse to ask about the failure of WAIC :))

denadai2 on 9 Aug 2017

@denadai2 Sorry if I missed something, but which failure of WAIC?

aloctavodia on 9 Aug 2017

@aloctavodia

DIC 217.868591309
/home/nadai/.local/lib/python3.4/site-packages/pymc3/stats.py:213: UserWarning: For one or more samples the posterior variance of the
        log predictive densities exceeds 0.4. This could be indication of
        WAIC starting to fail see http://arxiv.org/abs/1507.04544 for details

  """)
WAIC WAIC_r(WAIC=158.42521828487168, WAIC_se=2.3092993247226805, p_WAIC=7.2801294)
/home/nadai/.local/lib/python3.4/site-packages/pymc3/stats.py:278: UserWarning: Estimated shape parameter of Pareto distribution is
        greater than 0.7 for one or more samples.
        You should consider using a more robust model, this is
        because importance sampling is less likely to work well if the marginal
        posterior and LOO posterior are very different. This is more likely to
        happen with a non-robust model and highly influential observations.
  happen with a non-robust model and highly influential observations.""")
LOO LOO_r(LOO=164.91826, LOO_se=2.4069428499178245, p_LOO=10.526650309191908)

But, really, I'll ask in the forum so we don't disturb the others here :))

denadai2 on 10 Aug 2017

👍1

@aloctavodia https://discourse.pymc.io/t/dic-waic-wbic-on-regression-tasks/247 :)) thx

denadai2 on 14 Aug 2017

👍1

Was this page helpful?

0 / 5 - 0 ratings