Hello!
I find the AutoDiagonalNormal guide a bit hard to work with, since it gives you a single vector of mean values rather than a set of means whose names match your model's parameter names. The AutoDelta guide, on the other hand, is great! For example, if you have a model with a mu and a beta, AutoDelta gives you auto_mu_loc and auto_beta_loc, while AutoDiagonalNormal just gives you a length-two auto_loc param.
Would it be a welcome contribution to submit an AutoDiagonalNormal guide that works more like the AutoDelta? I have one (copied below) that works for my models, but am not sure if I've missed some important things.
Thanks! And thanks for all the work on this, it's really neat to use!
import pyro
import pyro.distributions as dist
import pyro.contrib.autoguide as ag
import torch
from torch.distributions import constraints
# Note the default prefix is '_auto'
from contextlib import ExitStack


class AutoNormal(ag.AutoGuide):
    """
    This implementation of :class:`AutoGuide` uses Normal(0, 1) distributions
    to construct a guide over the entire latent space. The guide does not
    depend on the model's ``*args, **kwargs``.

    It should be equivalent to pyro.contrib.autoguide.AutoDiagonalNormal, but
    with more convenient names. In AutoDiagonalNormal, if your model has N
    named parameters with dimensions k_i and sum k_i = D, you get a single
    vector of length D for your mean, and a single vec of length D for sigmas.
    This guide gives you N distinct normals that you can call by name.

    Usage::

        guide = AutoNormal(model)
        svi = SVI(model, guide, ...)
    """
    def __call__(self, *args, **kwargs):
        """
        An automatic guide with the same ``*args, **kwargs`` as the base
        ``model``.

        :return: A dict mapping sample site name to sampled value.
        :rtype: dict
        """
        # if we've never run the model before, do so now so we can inspect the
        # model structure
        if self.prototype_trace is None:
            self._setup_prototype(*args, **kwargs)

        plates = self._create_plates()
        result = {}
        for name, site in self.prototype_trace.iter_stochastic_nodes():
            print(f'param {name} is {site}')
            print(f"param {name} has fn {site['fn']}")
            print(f"param {name} has value {site['value']}")
            print(f"param {name} has value shape {site['value'].shape}")
            print(f"param {name} fn has attrs {vars(site['fn'])}")
            with ExitStack() as stack:
                for frame in site["cond_indep_stack"]:
                    if frame.vectorized:
                        stack.enter_context(plates[frame.name])
                loc_name = "{}_{}_{}".format(self.prefix, name, 'loc')
                scale_name = "{}_{}_{}".format(self.prefix, name, 'scale')
                loc_value = pyro.param(
                    loc_name,
                    lambda: torch.zeros(site["fn"]._batch_shape + site["fn"]._event_shape),
                    constraint=site["fn"].support
                )
                scale_value = pyro.param(
                    scale_name,
                    lambda: torch.ones(site["fn"]._batch_shape + site["fn"]._event_shape),
                    constraint=constraints.positive
                )
                result[name] = pyro.sample(
                    name,
                    dist.Normal(
                        loc_value, scale_value
                    )
                )
        return result
Hi @patrickeganfoley, this sounds very reasonable in principle. Can you provide some more details on how you'd like to use the proposed AutoNormal and where the existing AutoDiagonalNormal falls short?
So far I've always been able to get by using (1) the return value of guide(), which is a dict mapping site name to sample value; (2) the .median() and .quantiles() methods; and (3) the sites in the trace. I'd be interested to hear what other use cases you'd like to handle. One use case I'd like to handle is support for TraceMeanField_ELBO, which I believe requires per-site normal distributions as you suggest.
So using guide()[param_name] seems like it would work perfectly fine. I just prefer the convenience of having the named parameters show up when you call pyro.get_param_store().get_all_param_names().
Right now, if you use AutoDiagonalNormal, get_all_param_names() will just include auto_loc and auto_scale. We might not need a new autoguide or any change to the existing guide at all; it might just be that we need more examples of how to access the parameters by name. I think a lot of people don't know you can do better than looking through auto_loc and auto_scale by param ordering.
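To make "looking through auto_loc by param ordering" concrete, here is a minimal pure-Python sketch of slicing a flat vector into named per-site pieces. The helper `unpack_by_site`, the site shapes, and the numbers are all hypothetical. The sketch assumes the sites are packed in the order given, and it ignores any constraint transforms a real guide would apply.

```python
from math import prod


def unpack_by_site(flat, site_shapes):
    """Split a flat list into named chunks, one per site.

    site_shapes maps site name -> shape tuple, in packing order
    (an assumption for this sketch, not a documented Pyro contract).
    """
    out = {}
    pos = 0
    for name, shape in site_shapes.items():
        size = prod(shape)  # prod(()) == 1, so a scalar site takes one slot
        out[name] = flat[pos:pos + size]
        pos += size
    if pos != len(flat):
        raise ValueError("flat vector length does not match site shapes")
    return out


# Hypothetical values standing in for a length-3 auto_loc vector.
auto_loc = [0.1, -0.4, 0.7]
named = unpack_by_site(auto_loc, {"mu": (), "beta": (2,)})
```

This bookkeeping is what a per-site guide, or the dict returned by guide(), saves you from doing by hand.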
Just updated it to include _batch_shape and _event_shape. Should I set up a PR? Or try to run some of the examples using it?
I'm also not sure what you mean by "One use case I'd like to handle is support for TraceMeanField_ELBO". Would it require additional logic in the guide to do this?
Sure, feel free to open up a PR and we can continue discussion and start review. Once you have a PR up I'll add a Tasks section with boilerplate stuff like docs and tests etc.
TraceMeanField_ELBO support should come for free with your change. cc @martinjankowiak you might like this.
@abelstam12 (moving discussion here from the PR).
If you want confidence intervals for your parameters, then .quantiles() already provides that functionality in a simple generic way:
ci = guide.quantiles([0.05, 0.95])
ci["x"] # [5%,95%] confidence interval for site "x"
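For intuition about what such an interval contains, here is a small standard-library sketch that computes the 5% and 95% quantiles of a single Normal(loc, scale) posterior. It assumes the site has unconstrained real support; the helper `normal_interval` and the loc/scale values are hypothetical, and this is not a description of how .quantiles() is implemented.

```python
from statistics import NormalDist


def normal_interval(loc, scale, probs=(0.05, 0.95)):
    """Quantiles of Normal(loc, scale) at the given probabilities."""
    posterior = NormalDist(mu=loc, sigma=scale)
    return [posterior.inv_cdf(p) for p in probs]


# Hypothetical fitted values for one scalar site.
lo, hi = normal_interval(loc=1.0, scale=0.5)
```

For a Normal the interval is symmetric around loc, so lo and hi sit the same distance below and above 1.0.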
@fritzo of course. Then the methods I proposed are redundant for getting confidence intervals! Thanks for pointing me to the issue, I will keep an eye on it. Since for now I only need to manage a small number of sample sites, I can get away with using an AutoGuideList of normals, one for each sample site.
Thanks!
@abelstam12 just to be clear, you can usually use a single AutoDiagonalNormal for multiple sites: .quantiles() will give you a dict mapping site name to quantile tensor, and .__call__() will give you a dict mapping site name to sample.
I currently find myself wanting this functionality when I perform inference on two different models, using two different guides of the same AutoGuide class. For example, if I use AutoMultivariateNormal for both of them it will error (if I don't clear the param store in between), because I think it assigns the same param names for both guides. However, if I use AutoMultivariateNormal for one guide and AutoDiagonalNormal for the other, everything is fine. I believe that being able to add names (I think there used to be an argument called prefix that allowed this in a previous release) would fix this.
@bradyneal Interesting, I had not thought of that use case. I think one way to do that is to collect the two guides in a single PyroModule, e.g.
from pyro.nn import PyroModule
from pyro.infer.autoguide import AutoMultivariateNormal

guides = PyroModule()
guides.guide_1 = AutoMultivariateNormal(model_1)
guides.guide_2 = AutoMultivariateNormal(model_2)
...
svi = SVI(model_1, guides.guide_1, ...)
Let me know if this doesn't work for you (ideally at https://forum.pyro.ai 🙂).
Oh, and @patrickeganfoley's AutoNormal is already available in the Pyro 1.3 release as of #2050.
Oooh, yeah I should start using your Forum.
The AutoNormal naming is perfect 👍. Will similar naming be coming to AutoMultivariateNormal?
@bradyneal AutoNormal-like naming will not be coming to AutoMultivariateNormal any time soon due to basic design differences. But the PyroModule trick might work for your use case.