Pyro: Autoguide inference is incorrect for VonMises distribution

Created on 15 Jan 2021 · 4 Comments · Source: pyro-ppl/pyro

The VonMises distribution sets .support = constraints.real but should really set something like a constraints.circular or other mod-2-pi constraint. This can be an issue in autoguides which fit a variational posterior of the generic form

q = TransformedDistribution(Normal(loc, scale), biject_to(VonMises.support))

since the ELBO's q.entropy() term can incorrectly diverge to infinity because it doesn't know about wrapping around 2 pi.
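To make the entropy concern concrete, here is a minimal sketch in plain Python (not Pyro code). It compares the entropy of a Normal on the real line with the entropy of the same Normal wrapped onto the circle, estimated by midpoint quadrature. The function names, grid size, and number of wraps are illustrative assumptions. The point is only that the real-line entropy keeps growing with the scale, whereas any density on the circle has entropy at most log(2 pi).

```python
import math

TWO_PI = 2.0 * math.pi


def normal_entropy(scale):
    """Differential entropy of Normal(loc, scale) on the real line."""
    return 0.5 * math.log(2.0 * math.pi * math.e * scale ** 2)


def wrapped_normal_entropy(scale, num_wraps=50, num_grid=2000):
    """Entropy of Normal(0, scale) wrapped onto [0, 2*pi), by midpoint quadrature.

    num_wraps must be large enough that 2*pi*num_wraps covers many multiples
    of scale; the default is adequate for scale <= 10.
    """
    d_theta = TWO_PI / num_grid
    norm_const = scale * math.sqrt(TWO_PI)
    entropy = 0.0
    for i in range(num_grid):
        theta = (i + 0.5) * d_theta
        # Sum the normal density over copies of theta shifted by multiples of 2*pi.
        density = sum(
            math.exp(-0.5 * ((theta + TWO_PI * k) / scale) ** 2)
            for k in range(-num_wraps, num_wraps + 1)
        ) / norm_const
        if density > 0.0:
            entropy -= density * math.log(density) * d_theta
    return entropy


if __name__ == "__main__":
    print("upper bound on the circle:", math.log(TWO_PI))
    for scale in (0.1, 1.0, 10.0):
        print(scale, normal_entropy(scale), wrapped_normal_entropy(scale))
```

For small scales the two entropies agree closely. For large scales the real-line value exceeds the circular bound, which is the quantity an ELBO would wrongly reward.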

I believe HMC inference should still be correct, although convergence diagnostic statistics may be incorrect if they don't wrap mod 2 pi.
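As a rough illustration of why statistics that ignore wrapping can mislead, the sketch below (plain Python; the sample values and function names are made up for illustration) compares an ordinary mean with a circular mean for angles clustered around pi.

```python
import math


def naive_mean(angles):
    """Arithmetic mean that treats angles as ordinary real numbers."""
    return sum(angles) / len(angles)


def circular_mean(angles):
    """Mean direction: the angle of the summed unit vectors."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


# Hypothetical posterior samples clustered around pi (equivalently, -pi).
samples = [3.1, -3.1, 3.05, -3.05]

if __name__ == "__main__":
    print("naive mean:   ", naive_mean(samples))
    print("circular mean:", circular_mean(samples))
```

The naive mean lands near 0, on the opposite side of the circle from every sample, while the circular mean stays near pi. Diagnostics built on unwrapped means and variances could inherit the same problem.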

Possible solution

While I don't know how to cleanly implement something like transform_to(constraints.circular), the issue can be worked around by replacing VonMises with a ProjectedNormal distribution and using ProjectedNormalReparam during variational inference.

bug

All 4 comments

You are right. In NumPyro, we set the support to [-pi, pi], but that raises another concern: when the posterior mode concentrates around the angle pi, the posterior becomes bimodal in unconstrained space. Something like constraints.circular would be useful, but what would be a good bijective transform for it? Edit: there is probably no such transform: removing two points from a circle leaves 2 disconnected parts, while removing two points leaves 3 parts on the real line and 1 on the plane.

I think the right solution is to (1) port Pyro's ProjectedNormal to NumPyro, (2) change VonMises constraint to circular, and (3) omit circular from the biject_to registry, which will trigger an error if it is used in an autoguide. @spinkney the stereographic projection is not a bijection between Euclidean space and a sphere; as @fehiepsi observes those spaces are not homeomorphic hence there can be no diffeomorphism.

You're right that not every value maps, but it's just a single point (the projection pole). You end up going from K parameters to K + 1. This isn't the best paper, but it's an interesting concept: https://arxiv.org/pdf/1712.07764.pdf.

w_0^2 + ... + w_K^2 = 1
then the transform is x_k = f(w)_k = w_k / (1 - w_0) for k = 1, ..., K

the inverse is f^-1(x)_k = (S^2 - 1) / (S^2 + 1) if k = 0
and 2 x_k / (S^2 + 1) for all other k

where S^2 = sum(x_k^2) over all k
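The transform and its inverse can be checked numerically with a short sketch like this one (plain Python; the function names are illustrative). It also shows the one point with no image, the pole w_0 = 1, which is why the map is not a bijection between the whole sphere and Euclidean space.

```python
import math


def stereographic(w):
    """Map a unit vector w = (w_0, ..., w_K) with w_0 != 1 to x in R^K."""
    w0 = w[0]
    if math.isclose(w0, 1.0):
        raise ValueError("the pole w_0 = 1 has no image under this projection")
    return [wk / (1.0 - w0) for wk in w[1:]]


def inverse_stereographic(x):
    """Map x in R^K back to a unit vector (w_0, ..., w_K)."""
    s2 = sum(xk * xk for xk in x)
    w0 = (s2 - 1.0) / (s2 + 1.0)
    return [w0] + [2.0 * xk / (s2 + 1.0) for xk in x]


if __name__ == "__main__":
    w = [math.cos(2.0), math.sin(2.0)]  # a point on the unit circle
    x = stereographic(w)
    print("x =", x, "round trip =", inverse_stereographic(x))
```

As x moves far from the origin in any direction, the inverse image approaches the pole without ever reaching it.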

Apparently there's this paper, "On geometric probability distributions on the torus with applications to molecular biology", on using stereographic projections. Looks like they characterize a new inverse-stereographic normal and compare it to ISNB, von Mises and Wrapped Normal (see page 2726).

Anyway, looks like you got it figured out. Cheers
