Pyro: Request example/tutorial using GPyTorch in Pyro

Created on 26 Jul 2018  ·  7 Comments  ·  Source: pyro-ppl/pyro

It would be nice to have an example demonstrating how to learn GP hyperparameters in Pyro, using the GPyTorch implementation of GPs. We should be able to add a Pyro example/tutorial by forking the simple GP regression tutorial and replacing the torch optimization with Pyro optimization, possibly adding a prior.
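As a rough, library-free illustration of what "learning GP hyperparameters with a prior" means, the sketch below scores candidate lengthscales by the GP log marginal likelihood plus a LogNormal log prior and keeps the best one. It is not the requested Pyro/GPyTorch tutorial: it swaps gradient-based Pyro optimization for a plain grid search, and the toy data, the fixed noise level, the prior, and every function name are assumptions made for illustration.

```python
import math

def rbf_gram(xs, lengthscale, noise):
    """RBF kernel matrix with observation noise added on the diagonal."""
    return [[math.exp(-0.5 * ((a - b) / lengthscale) ** 2) + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def log_marginal_likelihood(xs, ys, lengthscale, noise=0.05):
    """log N(y | 0, K) for a zero-mean GP, computed via a Cholesky factor of K."""
    n = len(xs)
    L = cholesky(rbf_gram(xs, lengthscale, noise))
    z = []  # forward substitution: L z = y, so y^T K^{-1} y = z^T z
    for i in range(n):
        z.append((ys[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)

def lognormal_log_prob(x, mu=0.0, sigma=1.0):
    """Log density of LogNormal(mu, sigma) at x > 0 (the assumed lengthscale prior)."""
    return (-math.log(x * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2))

def log_joint(xs, ys, lengthscale):
    """Unnormalized log posterior of the lengthscale: log likelihood + log prior."""
    return log_marginal_likelihood(xs, ys, lengthscale) + lognormal_log_prob(lengthscale)

# Toy data (assumed): noiseless samples of sin(x).
xs = [0.4 * i for i in range(12)]
ys = [math.sin(x) for x in xs]

# Grid search stands in for the optimizer; a real example would use gradients.
grid = [0.2 + 0.1 * k for k in range(40)]
map_lengthscale = max(grid, key=lambda l: log_joint(xs, ys, l))
```

Dropping `lognormal_log_prob` from `log_joint` gives the maximum-likelihood estimate instead of the MAP estimate, which is the "possibly adding a prior" distinction made above.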

Labels: Examples, good first issue, help wanted

All 7 comments

cc @gpleiss @jacobrgardner @andrewgordonwilson

It was a great conversation! Because I didn't say much during the talk, I'd like to take this chance to share some of Pyro GP's features that might be helpful for your plan:

  • Besides SVI, Pyro GP also supports HMC.
  • For deep kernel learning, Cornell GPyTorch feeds the low-dimensional outputs of a pre-trained neural network into a GP model. Pyro GP, on the other hand, can train the network's parameters and the GP's hyperparameters at the same time. For example, https://github.com/uber/pyro/blob/dev/examples/contrib/gp/sv-dkl.py illustrates how to achieve this. The combination takes just a few lines of code (lines 97, 98, and 103). Training is also fast because it runs on mini-batches.
  • The mean function is also flexible to define. It can be a neural network, and you can use SVI to learn its parameters. This is useful when we want to define a deep GP model and have the mean function play the role of a 'skip layer', as in the doubly stochastic SVI paper. After my vacation, I'll write a tutorial replicating that paper to illustrate how to compose GP models using Pyro GP.
  • We can seamlessly set priors/constraints on hyperparameters using PyTorch/Pyro's distributions and distributions.constraints modules.
  • If we want to fix a hyperparameter (e.g. lengthscale), we just call kernel.fix_param("lengthscale").
  • Traditional sparse GPR models such as FITC and DTC are available, but I put more weight on the variational sparse approach because it can be trained on mini-batches and is compatible with arbitrary likelihoods.

From our discussion, I guess what Pyro GP currently lacks is:

  • KISS-GP + LOVE, which is well developed in Cornell GPyTorch.
  • A CG solve (in addition to the current Cholesky + triangular solve).
  • More sophisticated kernel configurations from Cornell GPyTorch.
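To make the CG bullet concrete, here is a minimal pure-Python sketch that solves K α = y for a small RBF kernel matrix two ways: with textbook conjugate gradients, which only needs matrix-vector products, and with a Cholesky factorization followed by two triangular solves. It is an illustration under assumed toy data and hypothetical function names, with no preconditioning or batching, and it is not the implementation used by either Pyro GP or GPyTorch.

```python
import math

def rbf_gram(xs, lengthscale=1.0, noise=0.1):
    """RBF kernel matrix with noise on the diagonal (symmetric positive definite)."""
    return [[math.exp(-0.5 * ((a - b) / lengthscale) ** 2) + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg_solve(A, b, tol=1e-10, max_iter=None):
    """Plain conjugate gradients for A x = b; touches A only through matvec."""
    n = len(b)
    max_iter = max_iter if max_iter is not None else 10 * n
    x = [0.0] * n
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs) < tol:
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def cholesky_solve(A, b):
    """Solve A x = b via A = L L^T, then forward and backward substitution."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    z = [0.0] * n          # forward substitution: L z = b
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n          # backward substitution: L^T x = z
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Toy problem (assumed): 8 inputs, targets sin(x).
xs = [0.5 * i for i in range(8)]
ys = [math.sin(x) for x in xs]
K = rbf_gram(xs)

alpha_cg = cg_solve(K, ys)
alpha_chol = cholesky_solve(K, ys)
max_diff = max(abs(a - b) for a, b in zip(alpha_cg, alpha_chol))
```

Because CG never factorizes K, it pairs naturally with structured kernels whose matrix-vector products are cheap, which is presumably why it comes up next to KISS-GP in the list above.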

I have read GPyTorch's GP regression tutorial. It seems the first thing to do is to call pyro.module/pyro.random_module on the model to register its parameters for pyro.optim. The loss can then be added to a Pyro model and learned under SVI/HMC using the Bernoulli trick. The next step is to replace or subclass gpytorch.priors.SmoothedBoxPrior to make it compatible with PyTorch/Pyro's Distribution (so that log_lengthscale_prior can be learned using pyro.param/pyro.sample).
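The "Bernoulli trick" is not spelled out in this thread. One common reading, given here as an assumption rather than as the commenter's exact recipe, is to attach an always-observed Bernoulli variable whose success probability encodes the loss. The symbols $z$, $\ell$, and $\theta$ below are notation introduced for this sketch:

```latex
\begin{aligned}
z \mid \theta &\sim \mathrm{Bernoulli}\bigl(p(\theta)\bigr),
  \qquad p(\theta) = \exp\bigl(-\ell(\theta)\bigr), \\
\log p(z = 1 \mid \theta) &= \log p(\theta) = -\ell(\theta).
\end{aligned}
```

Observing $z = 1$ therefore adds $-\ell(\theta)$ to the model's log joint, so SVI or HMC treats $\exp(-\ell(\theta))$ as an unnormalized likelihood. This requires $\ell(\theta) \ge 0$ so that $p(\theta) \in (0, 1]$. If the loss can be negative but is bounded below, adding a constant offset that does not depend on $\theta$ restores validity and changes the log joint only by a constant.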

@fritzo I'm going to be on vacation with my family until the end of September, so I might not be active in the discussion about this integration.

Just to clarify, GPyTorch absolutely trains the network parameters and GP hyperparameters simultaneously. This is largely the point and strength of DKL. In the MNIST example we start by pretraining just to show it is possible, but then the model is trained jointly. In the LOVE + SKI notebook we demonstrate training a deep kernel model from scratch, and we have several DKL CIFAR10 and CIFAR100 models trained end to end from scratch.

Edit: If it is helpful, I just added a tutorial on training a DKL + DenseNet model from scratch on CIFAR10 and CIFAR100 at https://github.com/cornellius-gp/gpytorch/blob/master/examples/DKL_DenseNet_CIFAR_Tutorial.ipynb 😃

@jacobrgardner Thanks for pointing that out! I just took a look and was amazed by the performance in the LOVE + SKI notebook. Well done!

@jacobrgardner has made a nice tutorial that shows how to do fully Bayesian inference for GPyTorch models. For those who want speed, @martinjankowiak has also made a nice example using NumPyro.

I think we can close this issue now.

Hi there!

@jacobrgardner has made a nice tutorial which shows how to do fully bayesian for GPyTorch models. For those who want speed, @martinjankowiak has also made a nice example using NumPyro.

Unfortunately, both of these links are broken! Does anyone know where these notebooks could be found now...?

@prashjet the GPyTorch-Pyro examples are here and the NumPyro example is here.
