Prophet: Minimum threshold (just like "cap" is a maximum threshold)

Created on 3 Mar 2017 · 11 Comments · Source: facebook/prophet

I'm trying to forecast turnover, and do not wish to be able to generate negative figures if the trend is a down slope.

We can already set maximum threshold values with the "cap" column. It'd be nice to be able to use a minimum threshold as well.

enhancement

All 11 comments

I ran into this same issue. My data was multiplicative in nature, so using a BoxCox transformation made sense for me. This made it so the predictions would level out at 0 for the most part and I would clip for any weird outliers in the lower confidence band.

import numpy as np
from scipy.stats import boxcox

BOXCOX_LAMBDA = 0.10
# Clip negatives to 0 and shift by 1 so every input to boxcox is strictly positive.
df["y"] = boxcox(df["actual"].clip(lower=0) + 1, lmbda=BOXCOX_LAMBDA)

You'll need to invert the transformation before plotting, though...

def invboxcox(y, ld):
    # Inverse of boxcox(x + 1, ld): maps transformed values back to the original scale.
    if ld == 0:
        return np.exp(y) - 1
    else:
        return np.exp(np.log(ld * y + 1) / ld) - 1

# Clip on the transformed scale before inverting. With lambda = 0.10, a transformed
# value of 1 maps back to about 1.59; a transformed value of 0 maps back to exactly 0.
forecast['yhat'] = invboxcox(forecast['yhat'].clip(lower=1), BOXCOX_LAMBDA)
forecast['yhat_lower'] = invboxcox(forecast['yhat_lower'].clip(lower=1), BOXCOX_LAMBDA)
forecast['yhat_upper'] = invboxcox(forecast['yhat_upper'].clip(lower=1), BOXCOX_LAMBDA)

forecast['trend'] = invboxcox(forecast['trend'], BOXCOX_LAMBDA)
forecast['trend_upper'] = invboxcox(forecast['trend_upper'], BOXCOX_LAMBDA)
forecast['trend_lower'] = invboxcox(forecast['trend_lower'], BOXCOX_LAMBDA)

forecast['seasonal'] = invboxcox(forecast['seasonal'], BOXCOX_LAMBDA)
forecast['seasonal_upper'] = invboxcox(forecast['seasonal_upper'], BOXCOX_LAMBDA)
forecast['seasonal_lower'] = invboxcox(forecast['seasonal_lower'], BOXCOX_LAMBDA)

forecast['yearly'] = invboxcox(forecast['yearly'], BOXCOX_LAMBDA)
forecast['yearly_upper'] = invboxcox(forecast['yearly_upper'], BOXCOX_LAMBDA)
forecast['yearly_lower'] = invboxcox(forecast['yearly_lower'], BOXCOX_LAMBDA)

# Put the stored history back on the original scale so plots line up with the forecast.
m.history['y'] = invboxcox(m.history['y'], BOXCOX_LAMBDA)
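
To see why this keeps forecasts non-negative, here is a minimal, dependency-free sketch of the same shifted Box-Cox round trip. The helper names `boxcox_shifted` and `inv_boxcox_shifted` are made up for illustration and are not part of SciPy or Prophet; the sample values are arbitrary.

```python
import math

def boxcox_shifted(x, lam):
    """Box-Cox transform of (x + 1); assumes x >= 0."""
    if lam == 0:
        return math.log(x + 1)
    return ((x + 1) ** lam - 1) / lam

def inv_boxcox_shifted(y, lam):
    """Inverse of boxcox_shifted; requires lam * y + 1 > 0."""
    if lam == 0:
        return math.exp(y) - 1
    return math.exp(math.log(lam * y + 1) / lam) - 1

LAM = 0.10
values = [0.0, 3.0, 250.0]  # arbitrary sample data on the original scale
transformed = [boxcox_shifted(v, LAM) for v in values]
recovered = [inv_boxcox_shifted(t, LAM) for t in transformed]

# A hypothetical forecast value that dips below 0 on the transformed scale.
raw_forecast = -0.5
unclipped_back = inv_boxcox_shifted(raw_forecast, LAM)          # negative on the original scale
clipped_back = inv_boxcox_shifted(max(raw_forecast, 0.0), LAM)  # clipped at 0 first
```

An original value of 0 transforms to 0, so clipping at 0 on the transformed scale corresponds to a floor of 0 after inverting, while an unclipped negative transformed value still inverts to a negative number.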

In general we recommend transforming your data so it's got a natural lower bound, so the Box-Cox solution is a good one.

But I think in the future we'll probably add the option to specify lower bounds for logistic growth models. I'm tagging this issue as an enhancement and adding it to the wishlist.

I ran into the same issue: I got negative forecast data points from daily data:

image

This seems to have a fairly simple solution. The logistic growth model is a sigmoid which saturates at the value specified in cap, but also saturates at 0. Indeed fitting Prophet to decreasing data with growth='logistic' produces saturation at 0 as shown in the attached notebook output. If you have real data that saturates to some lower bound, please try out offsetting your data so that the lower bound is 0 and see if it gives you something reasonable.

We can handle a minimum saturation value without having to make changes to the model by just treating it as an offset.

The interface should be just like that with 'cap': It is included in history and future dataframes as a column. Unlike 'cap', it must be optional to ensure backwards compatibility. If not provided, it will default to 0 (current behavior).
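
As a rough illustration of the offset idea, the sketch below subtracts an optional per-row floor from `y` and `cap` before fitting and adds it back to predictions afterwards. It uses plain dicts instead of dataframes, and the helper names `shift_by_floor` and `unshift_predictions` are hypothetical; this is not Prophet's actual code.

```python
def shift_by_floor(rows):
    """Return copies of rows with 'floor' subtracted from 'y' and 'cap'.

    A missing 'floor' defaults to 0, which leaves the row unchanged.
    """
    shifted = []
    for row in rows:
        floor = row.get("floor", 0.0)
        new_row = dict(row, cap=row["cap"] - floor)
        if "y" in row:  # future rows have no observed y
            new_row["y"] = row["y"] - floor
        shifted.append(new_row)
    return shifted

def unshift_predictions(yhat_shifted, rows):
    """Add each row's floor back to a prediction made on the shifted scale."""
    return [yhat + row.get("floor", 0.0) for yhat, row in zip(yhat_shifted, rows)]

history = [
    {"ds": "2017-01-01", "y": 12.0, "cap": 20.0, "floor": 5.0},
    {"ds": "2017-01-02", "y": 9.0, "cap": 20.0, "floor": 5.0},
]
shifted = shift_by_floor(history)
# Stand-in for model output: reuse the shifted y values as "predictions".
restored = unshift_predictions([r["y"] for r in shifted], history)
no_floor = shift_by_floor([{"ds": "2017-01-03", "y": 3.0, "cap": 10.0}])
```

Because the default floor is 0, rows without a `floor` entry pass through unchanged, which matches the backwards-compatibility requirement described above.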

The implementation will be:

Let's get this in for the upcoming v0.2 release. PRs are welcome if anyone would like to help on this!
minimum_saturation.pdf

https://github.com/facebookincubator/prophet/commit/cc3238acb7ce77d21515fb68311fb7844c886149 adds this in Python by including a column 'floor' with logistic growth, just like the 'cap' column. R and documentation coming soon.

This is now in the new v0.2 release, available on CRAN and PyPI. Documentation is here:
https://facebookincubator.github.io/prophet/docs/saturating_forecasts.html

If you have any issue with it feel free to reopen here, or open a new issue.

@bletham The current implementation allows a floor only if you also have an upper capacity.
It would be interesting to allow a floor even when there is no upper capacity.
A good example, to come back to the paper, is that you can't have fewer than 0 views, but you might not know the upper capacity.
In my opinion, the docs should at least mention that you can't use a floor without a cap.

Great point. https://github.com/facebookincubator/prophet/commit/2be8821c957a85b56530074fb0375118e67fc51d adds this to the documentation and I created #307 for making a trend that saturates only with a minimum.

@bletham can you explain a bit more the math and logic of how to use the logistic growth function to enforce a minimum?

@BrianMiner the logistic growth function is a model of population growth and so naturally has a minimum of 0: https://en.wikipedia.org/wiki/Logistic_function

To handle a floor that is different than 0, we just internally subtract the floor from the data, so that on the shifted data the floor is 0 and we use the usual logistic function.
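
A minimal sketch of that shift, assuming a single logistic curve with a constant rate `k` and midpoint `m` (both names are illustrative; Prophet's real trend also has changepoints, which are left out here):

```python
import math

def logistic_trend(t, cap, k, m, floor=0.0):
    """Logistic curve that saturates at floor on one side and cap on the other.

    Equivalent to evaluating the usual logistic function (lower bound 0)
    with capacity (cap - floor) and then adding the floor back.
    """
    return floor + (cap - floor) / (1.0 + math.exp(-k * (t - m)))

# A declining trend (k < 0) between a cap of 20 and a floor of 5.
early = logistic_trend(-100.0, cap=20.0, k=-0.5, m=0.0, floor=5.0)
middle = logistic_trend(0.0, cap=20.0, k=-0.5, m=0.0, floor=5.0)
late = logistic_trend(100.0, cap=20.0, k=-0.5, m=0.0, floor=5.0)
```

With a negative rate the curve starts near the cap, passes through the midpoint between cap and floor at `t = m`, and levels out at the floor instead of at 0.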

The alternative trend described in #307 is the (confusingly similarly named) logistic loss function (https://stats.stackexchange.com/questions/187625/how-can-logistic-loss-return-1-for-x-0), which also saturates at 0 but does not have an upper bound.
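
For comparison, here is a small sketch of that one-sided saturation using the function log(1 + exp(x)), often called softplus. Whether #307 uses exactly this form is not stated in this thread, so treat it as an illustration of the shape only.

```python
import math

def softplus(x):
    """log(1 + exp(x)), written to avoid overflow for large |x|."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

at_zero = softplus(0.0)        # log(2)
far_left = softplus(-50.0)     # tiny but still positive: saturates at 0
far_right = softplus(50.0)     # approximately 50: grows without an upper bound
declining = [softplus(-t) for t in (0.0, 5.0, 10.0)]  # a trend decaying toward 0
```

Passing `-t` gives a decreasing trend that approaches 0 from above without ever needing a cap.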

Does that answer the question?

Thanks a lot @bletham , it's totally clear.
