Pyro: Significantly different answers for PyMC3 NUTS vs Pyro NUTS

Created on 19 Oct 2018 · 17Comments · Source: pyro-ppl/pyro

Guidelines

NOTE: Issues are for bugs and feature requests only. If you have a question about using Pyro or general modeling questions, please post it on the forum.

If you would like to address any minor bugs in the documentation or source, please feel free to contribute a Pull Request without creating an issue first.

Please tag the issue appropriately in the title e.g. [bug], [feature request], [discussion], etc.

Please provide the following details:

Issue Description

The Pyro NUTS sampler gives significantly different posterior predictions with unrealistically small variance compared to the PyMC3 sampler.

Environment

For any bugs, please provide the following:

MacOS, Python 3.6
Pytorch '0.4.1'
Pyro 0.2.1

Code Snippet

Provide any relevant code snippets and commands run to replicate the issue.

The basic model is this:

def model(X, X_t, t):
    d = X.shape[1]
    d_t = X_t.shape[1]

    a = pyro.sample('alpha', dist.Normal(torch.zeros(1), torch.ones(1)))
    theta = pyro.sample('mu', dist.Normal(torch.zeros(1), torch.ones(1)))
    sigma = pyro.sample('sigma', dist.HalfCauchy(scale=0.2))

    mu = a + theta * t

    if d > 1:
        b = pyro.sample('beta', dist.MultivariateNormal(loc=torch.zeros(d),
                                                        scale_tril=torch.eye(d)))
        mu += X @ b

    if d_t > 1:
        b_t = pyro.sample('gamma', dist.MultivariateNormal(loc=torch.zeros(d_t),
                                                           scale_tril=torch.eye(d_t)))
        mu += X_t @ b_t

    return pyro.sample("obs", dist.Normal(mu, sigma))

def conditioned_model(model, X, X_t, t, y):
    return poutine.condition(model, data={"obs": y})(X, X_t, t)

nuts_kernel = NUTS(conditioned_model, adapt_step_size=True)
mcmc_sampler = MCMC(nuts_kernel, num_samples=5000, warmup_steps=1000)
posterior = mcmc_sampler.run(model, X_static_train, X_dynamic_train, t_train, y_train)

The PyMC3 model is this (it's a class method).

d = len(self._nnz_feat_names[col_name]['static'])

a = pm.Normal('alpha_' + col_name, mu=0, sd=1)
theta = pm.Normal('theta_' + col_name, mu=0, sd=1)
sigma = pm.HalfCauchy('sigma_' + col_name, beta=0.2)

#sigma_sum.append(sigma)

X = self._features_shared[col_name]
t = self._t_shared

mu = a + theta * t

if d > 0:
    b = pm.MvNormal('beta_' + col_name, mu=np.zeros(d), cov=features_static_scale * np.eye(d),
                    shape=d)
    mu += X.dot(b)

if self.add_time_interaction:
    d_t = len(self._nnz_feat_names[col_name]['time'])

    if d_t > 0:
        b_t = pm.MvNormal('gamma_' + col_name, mu=np.zeros(d_t),
                          cov=features_time_scale * np.eye(d_t), shape=d_t)
        X_t = self._features_t_shared[col_name]
        mu += X_t.dot(b_t)

# Likelihood (sampling distribution) of observations.
target = pm.Normal(col_name, mu=mu, sd=sigma, observed=self._train_df[col_name])

After fitting both models, the distribution of parameters is completely different.

enhancement

Source

FedericoV

All 17 comments

@FedericoV Recently, we are improving NUTS sampling with mass matrix adaptation, better terminate condition, and multinomial sampling. We hope that these improvements will available in dev branch in a week. Could you please test your model after that?

fehiepsi on 19 Oct 2018

Will be happy too. I love the model performance and being able to work directly with pytorch instead of theano is so convenient.

In the meanwhile, in case it helps, these are the means and standard deviations for pyro/pymc3:

print (pyro_beta.mean(axis=0))
print (pyro_beta.std(axis=0))
[-0.11745194 -0.04753426 -0.06974159  0.76850206  0.00760912  0.08875213
  0.1079157   0.05584696  0.01298933  0.06958022 -0.23925698 -0.10705016]
[2.2426018e-06 1.3150224e-06 7.8228629e-07 3.3200537e-05 2.3236639e-07
 2.6449156e-06 3.9040128e-06 1.2554272e-06 5.8302658e-07 1.0803126e-06
 6.6608027e-06 2.8684642e-06]

pyro_beta
print (beta0_pymc3.mean(axis=0))
print (beta0_pymc3.std(axis=0))
[ 0.0690354  -0.01569692  0.04986336 -0.0057656   0.10699803  0.04059889
  0.056447    0.00023073  0.01026281  0.02887486  0.06576778  0.00124869]
[0.01148268 0.01372258 0.01123893 0.01681948 0.00897278 0.00666878
 0.00943578 0.00543382 0.00454937 0.00794909 0.00986317 0.00882047]

For PyMC3, I'm using the mass matrix adaptation trick from @dfm (https://dfm.io/posts/pymc3-mass-matrix/) - except I'm using a different estimator of covariance matrix from sklearn.

FedericoV on 19 Oct 2018

👍1

@FedericoV - what is the acceptance rate that you are seeing for this model? It seems to me that you might not have generated enough number of uncorrelated samples due to a low acceptance rate. I would also suggest using PyTorch 0.4.0, because 0.4.1 has a number of distribution related bugs, and running this using pyro dev in a week's time as @fehiepsi suggested.

neerajprad on 19 Oct 2018

I'll definitely try the dev branch in a week time. In the meanwhile, how
do I access the sampler statistics?

I have found the API around inspecting the results of the sampler a little
cumbersome: it seems to require to instantiate a thin class around an
opaque output, and using that to pull out the required parameters.

Still: thanks so much for this awesome package, and I'll be sure to follow
all the advice.

On Thu, Oct 18, 2018 at 4:10 PM Neeraj Pradhan notifications@github.com
wrote:

@FedericoV https://github.com/FedericoV - what is the acceptance rate
that you are seeing for this model? It seems to me that you might not have
generated enough number of uncorrelated samples due to a low acceptance
rate. I would also suggest using PyTorch 0.4.0, because 0.4.1 has a
number of distribution related bugs, and running this using pyro dev in a
week's time as @fehiepsi https://github.com/fehiepsi suggested.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/uber/pyro/issues/1470#issuecomment-431193739, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAmddmX2KsQZKqxexXKmRQtgU8uAV4ODks5umQpdgaJpZM4XvW6i
.

FedericoV on 19 Oct 2018

This is part of the diagnostics reported in the progress bar now. But since you are not using Pyro dev, you should be able to enable logging via:

import logging

logging.basicConfig(format='%(levelname).1s \t %(message)s', level=logging.INFO)

Let me know if that doesn't work.

neerajprad on 19 Oct 2018

I believe your intuition about low acceptance rate was correct. For most of the sampling, it's below 1%. After downgrading pytorch to 0.4 and re-running the examples, I get this:

I    Starting MCMC using kernel - NUTS ...
I    Iteration: 300 [WARMUP]
I    Step size: 0.001384     Acceptance rate: 0.116667
I    Iteration: 600 [WARMUP]
I    Step size: 0.001656     Acceptance rate: 0.058333
I    Iteration: 900 [WARMUP]
I    Step size: 0.001584     Acceptance rate: 0.038889
I    Iteration: 1200 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.029167
I    Iteration: 1500 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.023333
I    Iteration: 1800 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.019444
I    Iteration: 2100 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.016667
I    Iteration: 2400 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.014583
I    Iteration: 2700 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.012963
I    Iteration: 3000 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.011667
I    Iteration: 3300 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.010606
I    Iteration: 3600 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.009722
I    Iteration: 3900 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.008974
I    Iteration: 4200 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.008333
I    Iteration: 4500 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.007778
I    Iteration: 4800 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.007292
I    Iteration: 5100 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.006863
I    Iteration: 5400 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.006481
I    Iteration: 5700 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.006140
I    Iteration: 6000 [SAMPLE]
I    Step size: 0.001715     Acceptance rate: 0.005833

I also tried drawing a bunch of samples from the prior, to see if perhaps I had issues with a grossly misspecified prior - but the prior actually looks quite good.

FedericoV on 19 Oct 2018

👍1

Great, so this should make for a good test case now! Do you mind posting your data (or, better still, the data generating function), so that we can test if this example runs better with some of our recent changes.

neerajprad on 19 Oct 2018

👍1

Unfortunately, I cannot share that particular dataset - I'm really sorry.

I can try to see if I can find another (public) dataset where I notice the
issue - and I'm happy to test it privately on the new branch, even if it's
a pre-alpha branch.

On Fri, Oct 19, 2018 at 10:57 AM Neeraj Pradhan notifications@github.com
wrote:

Great, so this should make for a good test case now! Do you mind posting
your data (or, better still, the data generating function), so that we can
test if this example runs better with some of our recent changes.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/uber/pyro/issues/1470#issuecomment-431445398, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAmddmjlq2LluuDzHMqD17btiEH01jndks5umhGigaJpZM4XvW6i
.

FedericoV on 19 Oct 2018

Unfortunately, I cannot share that particular dataset - I'm really sorry.

Np, it will be really helpful if you could isolate the issue and post a minimal model/dataset which yields unsatisfactory results. You could also try to use Pyro dev right now and let us know if you continue to see this issue.

neerajprad on 19 Oct 2018

Thanks for the help! I just checked out the dev branch, the acceptance rate seems to be much much higher:

1% 61/6000 [00:21<56:14, 1.76it/s, Step size=0.000115, Acceptance rate=0.871]

The sampling is orders of magnitude slower than the previous version, and significantly slower than pymc3. Since the data is not that big (X is roughly 10, 1200) and Y is (1, 1200) - would I get any kind of lift using a GPU? I'm currently testing this on my macbook, so only 4 cores and no GPU.

FedericoV on 19 Oct 2018

👍1

@FedericoV It is slow because the step_size is so small. Maybe you can try turning off adapt_step_size flag and setting step_size to a suitable value.

fehiepsi on 19 Oct 2018

Setting the full_mass=True significantly sped things up, and I'm now experimenting with setting a step size that still has an acceptable 'accept_ratio'. Thanks so much for the help everyone! I'll post a comparison of the results from pyro and pymc3 as soon as the runs are finished.

In terms of performance advice: other than re-parameterization, would having multiple cores or a gpu give a bigger boost?

FedericoV on 19 Oct 2018

@FedericoV When #1443 is merged, multiple cores would be very helpful. About cpu vs gpu, I am not so sure (my laptop has no gpu), could you test it with your data pls?

fehiepsi on 19 Oct 2018

@FedericalV Could you please share if there are still significantly different answers between PyMC3 and Pyro?

fehiepsi on 1 Nov 2018

Hi,

thanks for all the help guys. I'm having a hard time getting exactly the same model because in PyMC3 I use an LKJPrior to specify the full covariance matrix while in pyro I hacked up a low rank solution like so:

    sigma_diag = torch.diag(pyro.sample('sigma_diag', dist.HalfCauchy(scale=torch.ones(c))))
    sigma_lowrank = pyro.sample('sigma_lowrank', dist.Normal(loc=torch.zeros(c, sigma_rank),
                                                             scale=torch.ones(c, sigma_rank) * 0.1))
    sigma_full_unconstrained = sigma_lowrank @ sigma_lowrank.transpose(1, 0) + sigma_diag
    sigma_chol = transform_to(constraints.lower_cholesky)(sigma_full_unconstrained)

FedericoV on 7 Nov 2018

I guess you want to use LowRankMultivariateNormal?

fehiepsi on 7 Nov 2018

This is closed because the comparison is not fair for both frameworks (different models). About performance issue, it is tracked in #1511.

fehiepsi on 27 Nov 2018

Was this page helpful?

0 / 5 - 0 ratings