I have about 150 procedures, each with its own characteristics (for instance, seasonality, duration, price, and staffing needs). By fitting a Prophet model for each procedure, I can accurately predict the quantity of each procedure per hour. I believe I can add the yhat values to get the total number of procedures. However, how can I get the uncertainty interval of the sum of the predictions? I don't believe it would simply be the sum of the yhat_lower values and the sum of the yhat_upper values, respectively. Is it?
I tried 'calculating aggregate confidence intervals for forecasts' but I got lost while taking the variance of a sum of about 150 variables. And the papers I've read about forecasting at scale (for instance, Fine-Grained Time Series Forecasting At Scale With Facebook Prophet And Apache Spark) don't talk about uncertainty.
How is it possible to find the uncertainty interval for a sum of predictions?
I think this answer can help #1727 too.
Thanks in advance!
What you'd want to do here is directly sample from the predictive posterior, and then sum at the sample level. This would give you samples from the posterior for the sum, from which you could compute intervals. This is the method for getting the posterior samples:
https://github.com/facebook/prophet/blob/e41ed25646f44f713c110c30c07c678e4a07728e/python/fbprophet/forecaster.py#L1401-L1408
Suppose we have a list of the 150 models like model_list = [m1, m2, ...] and we want the sum forecast at the dates in future. Then, do
import numpy as np
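# We will store all the samples here.
# The last dimension is the number of samples per date; 1000 is Prophet's default uncertainty_samples.
yhat_samples = np.zeros((len(model_list), len(future), 1000))
for i, m in enumerate(model_list):
    yhat_samples[i, :, :] = m.predictive_samples(future)['yhat']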
# Sum over models. This now has 1000 samples of the posterior for the sum.
yhat_sum_samples = yhat_samples.sum(axis=0)
# Compute point estimate and quantiles
# The posterior mean, or we could use the posterior median too:
yhat_sum = np.mean(yhat_sum_samples, axis=1)
# an 80% interval
yhat_sum_lower, yhat_sum_upper = np.quantile(yhat_sum_samples, [0.1, 0.9], axis=1)
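To illustrate why the sample-level sum differs from simply adding yhat_lower and yhat_upper, here is a minimal sketch that runs without Prophet. It uses only the standard library, and the data are synthetic: each "model" is assumed to produce independent normal draws (mean 10, standard deviation 2) for a single date, standing in for one row of m.predictive_samples(future)['yhat']. The quantile helper and all variable names are made up for this illustration.

```python
import random

def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

random.seed(0)
n_models, n_samples = 150, 1000

# Synthetic stand-in for the per-model posterior samples at one date
# (assumption: independent normal draws, not real Prophet output).
samples = [
    [random.gauss(10.0, 2.0) for _ in range(n_samples)]
    for _ in range(n_models)
]

# Naive approach: add up each model's own 10% and 90% quantiles.
naive_lower = sum(quantile(sorted(row), 0.1) for row in samples)
naive_upper = sum(quantile(sorted(row), 0.9) for row in samples)

# Sample-level approach: sum draw j across all models, then take quantiles.
sum_samples = sorted(
    sum(row[j] for row in samples) for j in range(n_samples)
)
sum_lower = quantile(sum_samples, 0.1)
sum_upper = quantile(sum_samples, 0.9)

naive_width = naive_upper - naive_lower
sum_width = sum_upper - sum_lower
print("width from summed bounds: ", naive_width)
print("width from summed samples:", sum_width)
```

In this synthetic setup the interval from summed samples is much narrower than the one from summed bounds, because independent errors partly cancel when added. Note that sampling each model separately, as in the snippet above, treats the procedures as independent; if their errors are in fact correlated, the true interval for the sum could be wider than what this gives.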