In hmmlearn there is a parameter which controls the covariance type:
covariance_type : string, optional
String describing the type of covariance parameters to use. Must be one of
“spherical” — each state uses a single variance value that applies to all features.
“diag” — each state uses a diagonal covariance matrix.
“full” — each state uses a full (i.e. unrestricted) covariance matrix.
“tied” — all states use the same full covariance matrix.
Defaults to “diag”.
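For reference, this is how the parameter is passed in hmmlearn (toy data, purely for illustration):

import numpy as np
from hmmlearn.hmm import GaussianHMM

X = np.random.random((100, 20))   # 100 observations, 20 features
model = GaussianHMM(n_components=5, covariance_type="diag")
model.fit(X)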
I cannot find anything similar in pomegranate in the docs or code:
class MultivariateGaussianDistribution(MultivariateDistribution):
# no doc
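From reading the constructor it looks like it only takes a mean vector and a full covariance matrix (my reading of the code, since there is no docstring):

import numpy as np
from pomegranate import MultivariateGaussianDistribution

mu = np.zeros(20)    # mean vector
cov = np.eye(20)     # full covariance matrix; no apparent way to restrict it to diagonal
d = MultivariateGaussianDistribution(mu, cov)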
I tried to use IndependentComponentsDistribution over NormalDistribution, but I keep getting errors.
from pomegranate import *
dim = 20
n_component = 10
GeneralMixtureModel(IndependentComponentsDistribution([NormalDistribution] * dim), n_component)
ValueError: must either give initial distributions or constructor
I tried to initialize it differently, but I keep getting these errors.
Howdy @chananshgong
Sorry for the delay, I've been inundated with work recently. Currently I only support full covariance matrices, though at some point I'd like to support all types. If you want to use an IndependentComponentsDistribution you currently need to specify the initial parameters. However, this won't use BLAS so it's likely going to be much slower.
If I get time I'll look into a good performing solution soon. I've been working on Bayesian network structure learning recently.
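Something roughly like this (untested sketch; the initial means and stds are placeholders you would pick yourself):

import numpy as np
from pomegranate import (GeneralMixtureModel, IndependentComponentsDistribution,
                         NormalDistribution)

dim, n_components = 20, 10
components = [
    IndependentComponentsDistribution([
        NormalDistribution(np.random.randn(), 1.0)   # initial (mean, std) per feature
        for _ in range(dim)
    ])
    for _ in range(n_components)
]
model = GeneralMixtureModel(components)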
I managed to use IndependentComponentsDistribution/NormalDistribution
to achieve the "diag" equivalent, building the model like this:
import numpy as np
import pomegranate as pgn

# full_fset: all observations stacked, shape (n_samples, n_features)
# n_states: number of HMM states, n_cmps: mixture components per state
n_features = full_fset.shape[-1]
means = np.mean(full_fset, axis=0)
stds = np.std(full_fset, axis=0)

# random initial (mean, std) for every state / component / feature
np.random.seed(None)
dist_init = np.random.random((n_states, n_cmps, n_features, 2))
dist_init[..., 0] -= 0.5  # center the raw means around 0.0
for feat_i in range(n_features):
    # random initial mean in [mean - 2*std, mean + 2*std)
    dist_init[..., feat_i, 0] *= 4 * stds[feat_i]
    dist_init[..., feat_i, 0] += means[feat_i]
    # random initial std in [0, std / n_cmps)
    dist_init[..., feat_i, 1] *= stds[feat_i] / n_cmps

# one emission distribution per state: a mixture of "diag" Gaussians when
# n_cmps > 1, a single IndependentComponentsDistribution otherwise
dists = tuple(
    pgn.GeneralMixtureModel([
        pgn.IndependentComponentsDistribution(tuple(
            pgn.NormalDistribution(*dist_init[state_i, cmp_i, feat_i, :])
            for feat_i in range(n_features)
        ))
        for cmp_i in range(n_cmps)
    ])
    if n_cmps > 1 else
    pgn.IndependentComponentsDistribution(tuple(
        pgn.NormalDistribution(*dist_init[state_i, 0, feat_i, :])
        for feat_i in range(n_features)
    ))
    for state_i in range(n_states)
)

trans_mat = np.random.random((n_states, n_states))
starts = np.ones(n_states)
self.hmm = pgn.HiddenMarkovModel.from_matrix(trans_mat, dists, starts)
Hope it provides some clues; please help review my usage ;)
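For completeness, a sketch of how I then fit it (how you slice full_fset into per-sequence arrays is up to your data; this part is just illustrative):

# `sequences` is a list of 2D arrays, each of shape (timesteps, n_features)
self.hmm.fit(sequences, algorithm='baum-welch')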
Ultimately what I need to do is make it so that Model.from_samples is used throughout the package, and make it so that when IndependentComponentsDistribution.from_samples is used, it appropriately initializes all of the distributions which are passed in.
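In other words, the goal is that something along these lines just works and gives the diagonal-covariance behaviour out of the box (a sketch, not guaranteed to run on the current release):

import numpy as np
from pomegranate import GeneralMixtureModel, NormalDistribution

X = np.random.random((1000, 20))
# one univariate Normal per feature is effectively a diagonal covariance
model = GeneralMixtureModel.from_samples([NormalDistribution] * X.shape[1],
                                         n_components=10, X=X)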
Thanks for the example @complyue
@complyue I understand that this may be very slow...
@lxkain my approach is to randomize each training attempt to arrive at some surprising (or not) model parameters. I don't get your meaning of "slow", would you share?
I was only referring to the manner in which a diagonal-covariance MGD can be constructed, via an IndependentComponentsDistribution of NormalDistributions, which is slow according to Jacob.
Yeah. Not only does it currently not use BLAS, but it handles each example individually. A bunch of people have asked for this, it should be higher in my priority queue...
The IndependentComponentsDistribution approach should be much faster as of a month or two ago. Explicitly having options built-in is still on my queue.
This is great news, thank you! Question: what do you mean by the explicitly built-in options?
I mean that I'd like for you to be able to specify "covariance_type=..." in MultivariateGaussianDistribution when you call fit or from_samples.
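So, hypothetically, something along the lines of (not implemented yet, just the shape of the API I have in mind):

# hypothetical future API, mirroring hmmlearn; not available yet
d = MultivariateGaussianDistribution.from_samples(X, covariance_type="diag")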