Would it be possible to include a functionality for fitting a model like in GAMLSS? (already mentioned in issue #3915 and #2636)
I am thinking of a simple model with one independent and one dependent variable:
GAMLSS example
It should have the option to choose the link functions and distribution. I am interested in fitting the Box-Cox Cole Green distribution (BCCG) published by Cole (also called LMS method). This method has found application in many medical fields for creating reference curves.
Rigby and Stasinopoulos give enough information in their publications for the implementation (e.g. RS() fitting procedure):
I tried to implement it but was not so lucky so far.
.
Three parts:
In general, I think getting a prototype version to work should not be too difficult and requires loglikelihood and keeping track of and connecting/stacking the parameters.
Analytical derivatives for the optimization will most likely be more work and requires hand coding various partial derivatives that can be combined into the full derivatives using chain rule, e.g. that's what GLM does. In at least an initial version this can be replaced by numerical derivatives.
That feature would definitely attract a lot of statisticians and other scientists.
about the slow: you can replace for i in x loops by a vectorized numpy computation. Also 'nelder-mead' is slow. If the starting values are not very bad, then bfgs or lbfgsb will be much faster. Analytical derivatives will make it numerically more precise and most likely also faster.
I'm not sure whether something like the RS algorithm is necessary. It looks like a lot of work, and I would try first to see how far you can get with the scipy optimizers. IRLS in GLM is a fisher scoring variant, It might be possible to reuse GLM when all other shape parameter are fixed.
(I'm a bit doubtful about any claims that RS works well, especially the cases and datasets are less "nice")
Changing the loop to a vector made it of course a lot faster, from 4.3 to 0.6 seconds, thanks! "Nelder-mead" is quite stable and fast, bfgs and lbfgsb did not really succeed (will try it later again). 1D fitting is now pretty good.
I tried to implement the 2D fitting of a BCPE distribution (Python code). I used a simple model with linear median (M) and the other parameters constant (S, L, T). Optimizing again with "nelder-mead" for five parameters (two for M, and one for S, L, T respectively). Results are promising, I get similar parameters like the gamlss fit:
python fit
M = 45.53*x + 1.935
S = 0.033,
L = 1.04,
T = 1.28
gamlss fit
M = 45.535*x + 1.929
S = 0.0312
L = 0.99
T = 1.29
However, results highly depend on the starting values for M, S, L and T. Initial values have to be picked wisely. Maybe other optimizer will help. Next step should be higher degrees of freedom for the parameters.
adding a reference from citing articles (I didn't look at it but abstract sounds interesting)
Borghi, Elaine, Mercedes de Onis, Cutberto Garza, Jan Van den Broeck, Edward A. Frongillo, Laurence GrummerâStrawn, S. Van Buuren et al. "Construction of the World Health Organization child growth standards: selection of methods for attained growth curves." Statistics in medicine 25, no. 2 (2006): 247-265.
https://doi.org/10.1002/sim.2227
It looks to me like their power exponential distribution is gennorm in scipy, with a slightly different parameterization, normed so that the standard distribution has mean 0 and variance 1.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gennorm.htm
https://en.wikipedia.org/wiki/Generalized_normal_distribution#Version_1
the messy part is that the transformation with y> 0 requires truncation which they seem to ignore mostly in practice
gennorm has an explicit "special" cdf and pdf using gammaincc and gammainccinv
gennorm is symmetric so it might be good to combine with skewing transformations
Most helpful comment
about the slow: you can replace
for i in xloops by a vectorized numpy computation. Also 'nelder-mead' is slow. If the starting values are not very bad, then bfgs or lbfgsb will be much faster. Analytical derivatives will make it numerically more precise and most likely also faster.I'm not sure whether something like the RS algorithm is necessary. It looks like a lot of work, and I would try first to see how far you can get with the scipy optimizers. IRLS in GLM is a fisher scoring variant, It might be possible to reuse GLM when all other shape parameter are fixed.
(I'm a bit doubtful about any claims that RS works well, especially the cases and datasets are less "nice")