Prophet: R crashes randomly when Prophet models are built in a loop for more than 100 iterations

Created on 7 Mar 2017 · 18 Comments · Source: facebook/prophet

I am trying to carry out forecasting for multiple segments, but R (3.3.2) randomly crashes after around 70+ iterations. I don't think this is a RAM issue, as R sometimes crashes after as few as 10 iterations and sometimes runs the entire 1000+ loops. It's not an input data issue either, as each model builds fine when built individually. Please help!

The code used is attached for reference.
Prophet in loop.txt

R bug

Gautam12345

👍1

All 18 comments

I may have a similar problem. I'm using R version 3.3.1 (2016-06-21) -- "Bug in Your Hair" on Ubuntu 16.04. Installation went OK, but R crashes every time the function prophet() is called.

Code used (example from documentation):
sample.txt

jsanter on 7 Mar 2017

We've gotten this report a couple of times and it's going to require a little bit of digging. Apologies for the inconvenience.

seanjtaylor on 8 Mar 2017

I just ran into this issue, too. Looks like the problem is somewhere within predict.prophet(). Looping over the M3 data always breaks at the predict stage, not the call to prophet()

example.txt

max-graham on 8 Mar 2017

👍2

I'm experiencing the same issue on a Windows 7 machine, with R 3.3.3.

thijsknaap on 10 Mar 2017

Experiencing the same issue on Windows 10 with R 3.3.3

namman2 on 11 Apr 2017

Spent several hours on this yesterday. I have some updates. It's definitely happening in predict, likely in predict_uncertainty, and it goes away if you use a smaller number for uncertainty.samples. I'm going to try GDB as proposed here in order to make some more progress.

seanjtaylor on 11 Apr 2017

@seanjtaylor Thanks a lot for your work on this. I reduced uncertainty.samples to a very low value and re-ran, but the session still gets aborted after some random interval. Is there a range of values for this argument that does not cause R to crash? Thanks again!

namman2 on 12 Apr 2017

Here at SUPERCRUNCH, we have also run into this Mandelbug, which makes prophet crash non-deterministically after it has been called a number of times. For example, the following loop, which repeatedly analyzes a synthetic dataset, may crash as early as the fourth iteration, while at other times it takes several hundred iterations before the failure occurs:
```{r}
library(lubridate)
library(data.table)
library(prophet)

ds <- as.POSIXct("2011-01-01") + months(0 : 57)
y <- c(79, 88, 86, 94, 101, 92, 79, 77, 87, 78, 69, 62, 67, 79, 75, 84, 78, 98, 114, 126, 154, 128, 120, 118, 110, 95, 82, 102, 116, 100, 107, 104, 108, 96, 86, 100, 98, 126, 135, 141, 168, 140, 130, 115, 123, 143, 163, 169, 139, 136, 129, 117, 108, 93, 79, 92, 64, 94)
test_data <- data.table(ds = ds, y = y)

for (i in 1:1000) {
  print(i)
  res_prophet <- prophet(test_data, growth = "linear", weekly.seasonality = FALSE)
  forecast <- predict(res_prophet)
}
```


When the `predict` step is removed, the problem goes away, confirming your observation that the failure occurs in this routine. However, the failure seems to be related to the fact that `fit.prophet` loads the Stan model every time it is called.

As a quick and dirty hack, we have adapted `prophet` and `fit.prophet` so that the Stan model can be loaded once and then passed to these routines:
```{r}
prophet.adapted <- function (df = df, growth = "linear", changepoints = NULL, n.changepoints = 25, 
    yearly.seasonality = TRUE, weekly.seasonality = TRUE, holidays = NULL, 
    seasonality.prior.scale = 10, changepoint.prior.scale = 0.05, 
    holidays.prior.scale = 10, mcmc.samples = 0, interval.width = 0.8, 
    uncertainty.samples = 1000, fit = TRUE, model = NULL, ...) 
{
    if (!is.null(changepoints)) {
        n.changepoints <- length(changepoints)
    }
    m <- list(growth = growth, changepoints = changepoints, n.changepoints = n.changepoints, 
        yearly.seasonality = yearly.seasonality, weekly.seasonality = weekly.seasonality, 
        holidays = holidays, seasonality.prior.scale = seasonality.prior.scale, 
        changepoint.prior.scale = changepoint.prior.scale, holidays.prior.scale = holidays.prior.scale, 
        mcmc.samples = mcmc.samples, interval.width = interval.width, 
        uncertainty.samples = uncertainty.samples, start = NULL, 
        y.scale = NULL, t.scale = NULL, changepoints.t = NULL, 
        stan.fit = NULL, params = list(), history = NULL)
    prophet:::validate_inputs(m)
    class(m) <- append("prophet", class(m))
    if (fit) {
        m <- fit.prophet.adapted(m, df, model, ...)
    }
    return(m)
}


fit.prophet.adapted <- function(m, df, model, ...) {
    # Call dplyr::filter directly so this function does not depend on `%>%` being attached
    history <- dplyr::filter(df, !is.na(y))

    out <- prophet:::setup_dataframe(m, history, initialize_scales = TRUE)
    history <- out$df
    m <- out$m
    m$history <- history
    seasonal.features <- prophet:::make_all_seasonality_features(m, history)

    m <- prophet:::set_changepoints(m)
    A <- prophet:::get_changepoint_matrix(m)

    # Construct input to stan
    dat <- list(
        T = nrow(history),
        K = ncol(seasonal.features),
        S = length(m$changepoints.t),
        y = history$y_scaled,
        t = history$t,
        A = A,
        t_change = array(m$changepoints.t),
        X = as.matrix(seasonal.features),
        sigma = m$seasonality.prior.scale,
        tau = m$changepoint.prior.scale
    )

    # Run stan
    if (m$growth == 'linear') {
        kinit <- prophet:::linear_growth_init(history)
        if(is.null(model)) model <- prophet:::get_prophet_stan_model('linear')
    } else {
        dat$cap <- history$cap_scaled  # Add capacities to the Stan data
        kinit <- prophet:::logistic_growth_init(history)
        if(is.null(model)) model <- prophet:::get_prophet_stan_model('logistic')
    }

    stan_init <- function() {
        list(k = kinit[1],
             m = kinit[2],
             delta = array(rep(0, length(m$changepoints.t))),
             beta = array(rep(0, ncol(seasonal.features))),
             sigma_obs = 1
        )
    }

    if (m$mcmc.samples > 0) {
        stan.fit <- rstan::sampling(
            model,
            data = dat,
            init = stan_init,
            iter = m$mcmc.samples,
            ...
        )
        m$params <- rstan::extract(stan.fit)
        n.iteration <- length(m$params$k)
    } else {
        stan.fit <- rstan::optimizing(
            model,
            data = dat,
            init = stan_init,
            iter = 1e4,
            as_vector = FALSE,
            ...
        )
        m$params <- stan.fit$par
        n.iteration <- 1
    }

    # Cast the parameters to have consistent form, whether full bayes or MAP
    for (name in c('delta', 'beta')){
        m$params[[name]] <- matrix(m$params[[name]], nrow = n.iteration)
    }
    # rstan::sampling returns 1d arrays; converts to atomic vectors.
    for (name in c('k', 'm', 'sigma_obs')){
        m$params[[name]] <- c(m$params[[name]])
    }
    # If no changepoints were requested, replace delta with 0s
    if (m$n.changepoints == 0) {
        # Fold delta into the base rate k
        m$params$k <- m$params$k + m$params$delta[, 1]
        m$params$delta <- matrix(rep(0, length(m$params$delta)), nrow = n.iteration)
    }
    return(m)
}
```

The script copied below, making use of the adapted routines, seems to work fine:
```{r}
library(lubridate)
library(data.table)
library(prophet)

prophet_stan_model <- prophet:::get_prophet_stan_model(model = "linear")
for (i in 1:1000) {
  print(i)
  res_prophet <- prophet.adapted(test_data, growth = "linear", weekly.seasonality = FALSE, model = prophet_stan_model)
  forecast <- predict(res_prophet)
}
```

Hope this helps.

Michael and Jakub

Mathemagically on 12 Apr 2017
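
The load-once idea in the comment above can also be expressed as a small cache around the loader, so the model does not have to be threaded through every call by hand. This is only a sketch: `make_cached_loader` and `slow_loader` are hypothetical names, not part of prophet, and the stand-in loader just counts its calls so the effect is visible without prophet installed.

```r
# Hypothetical helper (not part of prophet): wraps any loader function so that
# each distinct key is loaded once and afterwards served from an in-memory cache.
make_cached_loader <- function(loader) {
  cache <- new.env(parent = emptyenv())
  function(key) {
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, loader(key), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

# Stand-in for an expensive model load; it only counts how often it is called.
n_loads <- 0
slow_loader <- function(growth) {
  n_loads <<- n_loads + 1
  paste("compiled model for", growth)
}

get_model <- make_cached_loader(slow_loader)
for (i in 1:1000) {
  model <- get_model("linear")
}
print(n_loads)  # 1: the loader ran once, not 1000 times
```

With the prophet version discussed in this thread, the loader could presumably be `function(g) prophet:::get_prophet_stan_model(g)`, with `get_model("linear")` passed as the `model` argument of `prophet.adapted`. That internal function is taken from the code above and may not exist in other releases.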

Thanks @Mathemagically! We'll look into this solution more carefully for the v0.2 release. For now, it looks like calling gc() after model fitting fixes the bug (just committed this), so we're going to ship that as a temporary bugfix.

seanjtaylor on 13 Apr 2017

I was able to replicate this issue with several loops, and https://github.com/facebookincubator/prophet/commit/c164367c0841c2539e4aef30fbc986a36d465c4e fixed the issue for me. If you currently have this issue, please install the latest version and see if this resolves it:

devtools::install_github('facebookincubator/prophet', subdir='R')

bletham on 13 Apr 2017

Thanks for the temp fix, guys. @bletham I re-ran my use case after updating, and I can confirm that it no longer crashes within the first 1000 loops. However, I'm working on something that needs more loops than that, and running more than 1000 loops still causes Prophet to crash R. Things have improved a lot since the bug fix, but it still crashes for my needs. Can you please verify this and let me know if there's any room for improving the fix? Thanks!

namman2 on 13 Apr 2017

@namman2 just to be clear, it now does not fail during the first 1000 iterations, but somewhere after 1000 it does still segfault?
I personally don't have a good understanding of what is causing the segfault or why gc() helps. If you don't need uncertainty intervals, then reducing uncertainty.samples might postpone the failure. It does seem to be memory related so you might try deleting the model object from memory in the loop rather than overwriting it. But I really don't know what exactly is happening.

bletham on 13 Apr 2017
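
As a rough sketch of the suggestion above (delete the model inside the loop instead of overwriting it, then collect garbage), a loop over segments could be structured as follows. `fit_segments` is a hypothetical helper, not a prophet function, and `lm` is used as a stand-in model so the example runs without prophet. Whether this avoids the segfault is not established in this thread.

```r
# Hypothetical helper (not part of prophet): fits one model per segment and
# frees each model before the next iteration instead of overwriting it.
fit_segments <- function(segments, fit_fun, predict_fun = predict) {
  forecasts <- vector("list", length(segments))
  names(forecasts) <- names(segments)
  for (i in seq_along(segments)) {
    model <- fit_fun(segments[[i]])
    forecasts[[i]] <- predict_fun(model)
    rm(model)  # drop the reference to the fitted model explicitly
    gc()       # ask R to run garbage collection right away
  }
  forecasts
}

# Stand-in data and model (lm) so the sketch is runnable on its own.
df <- data.frame(
  segment = rep(c("a", "b"), each = 5),
  x = 1:10,
  y = 2 * (1:10)
)
forecasts <- fit_segments(
  split(df, df$segment),
  function(d) lm(y ~ x, data = d)
)
```

With prophet, `fit_fun` might be something like `function(d) prophet(d, uncertainty.samples = 100)`, assuming each segment is a data frame with `ds` and `y` columns. The value 100 is only illustrative of a smaller `uncertainty.samples`.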

@bletham Yes, it does segfault somewhere after 1000 iterations now. It was more non-deterministic before and is a bit more stable now.
I do not need uncertainty intervals for my use case, so I set uncertainty.samples to 0. I don't see this anywhere in the documentation, and I know that confidence intervals are a part of forecasting, but is there a way to stop prophet from computing or reporting the intervals altogether?
Yes, I will definitely try removing the model in the loop itself once it executes. The fact that I'm overwriting it might be what is causing this issue. Do you suggest using rm on just the model object, or gc at the end of the loop?
Thanks again!

namman2 on 13 Apr 2017

@bletham So, I ran it again using your suggestions and it still segfaults at around the 1000th iteration. While it was more random previously, it now looks like it crashes as it reaches the 1000th iteration.

namman2 on 13 Apr 2017

@bletham I have installed the latest version, including c164367, and I can confirm that on our machine the failure does not occur when running my original example, even if the loop is carried out for 2000 iterations. However, the garbage collection seems to significantly slow down execution - maybe this is the cause of the problem noticed by @joaopcoelho, assuming that he is using the latest version. (Alternatively, the increasing runtime might be related to the aging behavior in the original prophet release.)

Actually, your hack employs some form of software rejuvenation to counteract the aging effects caused by an aging-related bug, while our hack prevents the aging-related bug from being activated repeatedly. Of course, in the end the best solution is to remove the aging-related bug itself, if possible, but these hacks are already good lines of defense against failure occurrences even in the presence of the fault. So the latest prophet version that you provided is much appreciated.

Mathemagically on 13 Apr 2017

@namman2 thanks for your help in debugging this.
I added a branch in https://github.com/facebookincubator/prophet/commit/551db63cfa3fc73a5b8e6a7b752b0756d8fcd0a6 that stops loading the Stan model each iteration, as @Mathemagically had suggested. I do not get segfaults in a loop of 2000 fits. Can you install this version and see if you do? You'll have to install the loop_debug branch:

devtools::install_github('facebookincubator/prophet', subdir='R', ref='loop_debug')

bletham on 14 Apr 2017

🎉1

Thanks a lot, @bletham and @Mathemagically.
I can confirm that the issue is now resolved for me: just over 2000 iterations ran without segfaults.

namman2 on 14 Apr 2017

Wooo excellent!

seanjtaylor on 14 Apr 2017
