How can I get cross_validation and metrics on monthly data? There's only a closed issue with workarounds for this, #586. Here's my code:
from fbprophet import Prophet

m = Prophet(seasonality_mode='multiplicative',
            interval_width=0.95).fit(train_set)
future = m.make_future_dataframe(periods=12, freq='MS')
forecast = m.predict(future)
forecast = forecast[['ds', 'yhat_lower', 'yhat', 'yhat_upper']]
How can I get cross-validation in Python with monthly data?
Hi PyDataBlog,
I faced the same issue today. Below is my solution (inspired by the following link: https://towardsdatascience.com/implementing-facebook-prophet-efficiently-c241305405a3)
import pandas as pd
from fbprophet.diagnostics import cross_validation

# Note: newer pandas versions reject unit="M" (see the replies further down).
cv_results = cross_validation(
    model=m,
    initial=pd.to_timedelta(12, unit="M"),
    horizon=pd.to_timedelta(12, unit="M"))
Essentially, you call "cross_validation" and specify, in _pandas.Timedelta_ style, the forecasting horizon that the CV procedure should use and (optionally) the initial training period.
In my code I used an initial period of 1 year to capture the yearly seasonality of my data, and a horizon of 1 year.
cv_results will contain _y_ and _yhat_ columns, which you can use to evaluate your model with any metric you like.
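As a rough illustration of evaluating with "any metric you like", here is a minimal sketch that computes MAE and MAPE from the _y_ and _yhat_ columns. The small cv_results frame below is a hypothetical stand-in with made-up numbers, not real output; only the column names (ds, cutoff, y, yhat) follow what cross_validation returns.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by cross_validation.
# The values are invented purely to make the arithmetic easy to follow.
cv_results = pd.DataFrame({
    "ds": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01"]),
    "cutoff": pd.to_datetime(["2019-12-01", "2019-12-01", "2020-02-01", "2020-02-01"]),
    "y": [100.0, 110.0, 120.0, 125.0],
    "yhat": [90.0, 121.0, 114.0, 130.0],
})

# Absolute error of each forecasted point.
abs_err = (cv_results["y"] - cv_results["yhat"]).abs()

# Mean absolute error and mean absolute percentage error over all points.
mae = abs_err.mean()
mape = (abs_err / cv_results["y"].abs()).mean()

# The same MAE broken down per cutoff, i.e. per simulated forecast.
mae_by_cutoff = abs_err.groupby(cv_results["cutoff"]).mean()

print(f"MAE={mae:.2f}  MAPE={mape:.4f}")
print(mae_by_cutoff)
```

The library also ships a performance_metrics helper in the same diagnostics module; computing the metric by hand as above is just the most transparent way to see what goes into it.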
+1 for @mesk4217's code. Other than setting the horizon/initial appropriately for monthly data, there shouldn't really be any difference in the cross validation with monthly data vs. daily data.
I want to forecast 15 min data using the last 1 hr of data, so what unit should I give in "initial" and "horizon"? E.g. I want initial = 1 hour and horizon = 15 min, so what will the unit be?
Is the code below correct?
cv_results = cross_validation( model = m, initial = pd.to_timedelta(1,unit="H"), horizon = pd.to_timedelta(15,unit="Min"))
Can anyone please help me with this?
@stutig14 Sure, you can also just put it in as a string and pandas will convert it for you. Like,
cross_validation(model=m, initial='1 hour', horizon='15 minutes')
Just to be clear, this will make a series of forecasts, each providing an estimate of forecast error. The first one will use 1 hour of history and will make a prediction for the next 15 minutes. It will then increase the history size by the amount specified as period (if unspecified, as here, it defaults to horizon/2, so 7.5 minutes) and make another forecast. That is, it will use the first 67.5 minutes of history and will again forecast out 15 minutes. This process is repeated with the history size growing until there are fewer than 15 minutes of data left past the end of the history. This is slightly different from using only the last 1 hr of data at every cutoff, so I wanted to make sure that was clear to you.
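To make the growing-history procedure concrete, here is a small pandas-only sketch that lists the cutoffs exactly as described in the comment above. The function name expanding_window_cutoffs and the two-hour example history are assumptions for illustration; Prophet's internal cutoff selection may differ in detail (for example, in which end of the history it starts from), so treat this as a picture of the idea rather than the library's code.

```python
import pandas as pd

def expanding_window_cutoffs(start, end, initial, horizon, period=None):
    """Hypothetical helper: list the cutoffs of an expanding-window CV.

    The first cutoff comes after `initial` of history; each later cutoff
    moves forward by `period` (default horizon / 2) while a full `horizon`
    of data still remains after the cutoff.
    """
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    initial = pd.Timedelta(initial)
    horizon = pd.Timedelta(horizon)
    period = horizon / 2 if period is None else pd.Timedelta(period)

    cutoffs = []
    cutoff = start + initial
    while cutoff + horizon <= end:
        cutoffs.append(cutoff)
        cutoff = cutoff + period
    return cutoffs

# Assumed example: two hours of history, 1 hour initial, 15 minute horizon.
history_start = pd.Timestamp("2020-01-01 00:00")
history_end = pd.Timestamp("2020-01-01 02:00")
cutoffs = expanding_window_cutoffs(history_start, history_end,
                                   initial="1 hour", horizon="15 minutes")

for c in cutoffs:
    print("cutoff:", c, "| history used:", c - history_start)
```

Each printed row corresponds to one simulated forecast; the "history used" column grows by 7.5 minutes per step, which is the difference from a fixed 1-hour rolling window.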
@bletham Thanks for your kind help.
Can you please clear up one more doubt about the following code:
m = Prophet()
model = m.fit(df)
future = m.make_future_dataframe(freq='min',periods=15)
forecast = m.predict(future)
cross_validation(model=m, initial='1 hour', horizon='15 minutes')
Doubts:
1. In cross validation, do we have to give model = model, i.e. the fitted Prophet model, or model = m, i.e. the Prophet object?
2. In "m.make_future_dataframe(freq='min',periods=15)", as I want to forecast 15 min data using the last 1 hr of data, what unit should I give in "freq"?
3. Can I give "future = m.make_future_dataframe(freq='15min',periods=1)"? This will give me 1 forecasted value for that 15 min, right?
4. And if I give "future = m.make_future_dataframe(freq='min',periods=15)", then it will give 15 forecasted values for the next 15 min, right?
5. In "forecast = m.predict(future)", do we have to call "m.predict()" or "model.predict()"?
Is the above code for cross validation and the future dataframe correct?
Thanks in advance.
m.predict uses all data to make a prediction. So if you construct a future dataframe with the next 15 minutes, it will make predictions for the next 15 mins but using all data, not just the last 1 hr. If you want to only use the last 1 hr to make predictions, then you need to filter the history dataframe (df) to only have the last hour of data before you fit the model. You can see the documentation for more about the future dataframe here: https://facebook.github.io/prophet/docs/quick_start.html#python-api
For make_future_dataframe(freq='15min', periods=1), look at future.tail() and you can see what dates predictions are being made for. In this case that is correct.
For make_future_dataframe(freq='min', periods=15), look at future.tail() and you will see that it will have a prediction for each of the 15 minutes following the end of the history.
m.fit(df) modifies m in-place, and then returns m (so operations can be chained). Here, m and model are the same (they refer to the same object in memory). Everything can be done with m. In fact you could just do
m = Prophet()
m.fit(df)  # no need to name the output here
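The step of filtering the history to its last hour before fitting could look roughly like the sketch below. The 1-minute example frame is a made-up assumption; the only thing taken from the thread is Prophet's ds/y column layout and the idea of trimming df before calling fit.

```python
import pandas as pd

# Hypothetical 1-minute history in the ds/y layout Prophet expects.
df = pd.DataFrame({
    "ds": pd.date_range("2020-01-01 00:00", periods=180, freq="min"),
    "y": range(180),
})

# Keep only rows that fall within one hour of the latest timestamp.
last_ts = df["ds"].max()
recent = df[df["ds"] > last_ts - pd.Timedelta("1 hour")]

print(recent["ds"].min(), "to", recent["ds"].max(), "rows:", len(recent))

# The trimmed frame would then be passed to fit instead of df, e.g.:
# m = Prophet()
# m.fit(recent)
```

Whether one hour of 1-minute points is enough history for a useful fit is a separate question; the sketch only shows the filtering mechanics.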
Thanks a lot @bletham for your help.
Thanks for the help @mesk4217. What would you do for quarterly history data when trying to predict a couple of quarters forward? Enjoyed your article.
@leeprevost Did you find any solution for quarterly historic data?
@DivJ I did some fairly extensive cross validation on a quarterly dataset of stock financial histories. I found these to be the best parameters to get effective predictions on revenues:
{'changepoint_prior_scale': 0.6,
 'n_changepoints': 6,
 'seasonality_mode': 'additive',
 'seasonality_prior_scale': 0.0001,
 'growth': 'logistic',
 'changepoint_range': 0.8,
 'holidays_prior_scale': 0,
 'mcmc_samples': 0,
 'weekly_seasonality': False,
 'daily_seasonality': False}
You don't want weekly or daily seasonality if you don't have that level of granularity in your data.
The pd.to_timedelta(12, unit="M") approach from @mesk4217's comment doesn't work any more, sadly. pandas now raises:
"Units 'M' and 'Y' are no longer supported, as they do not represent unambiguous timedelta values durations."
I have the same issue.
Please re-open the issue.
pandas to_timedelta no longer supports 'M' for month.
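One possible way around the error, assuming a fixed-length approximation of a year is acceptable for your data, is to express initial and horizon in days, which pandas still accepts. The 365-day figure below is an assumption chosen to mirror the original 12-month windows; it is not a recommendation from the thread.

```python
import pandas as pd

# Days have a fixed length, so pandas still accepts them as a timedelta unit.
initial = pd.to_timedelta(365, unit="D")
horizon = pd.to_timedelta(365, unit="D")

# The same windows written as plain strings, which pandas parses identically.
initial_str = "365 days"
horizon_str = "365 days"

print(initial, "|", horizon)

# With a fitted model m (not constructed here), the call would look like:
# cv_results = cross_validation(model=m, initial=initial_str, horizon=horizon_str)
```

Because months differ in length, a 365-day horizon will not always end exactly on a month start, so it is worth checking which dates end up in each forecast window.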
Hi @DivJ and @leeprevost
I am working on a data set with Quarterly data and my goal is to forecast 4 Quarters.
Did you guys find any solution on how to forecast for Quarterly time periods using fbprophet?
Train Data:
YearQtr | A
-- | --
2018-01-01 | 2.92
2018-04-01 | 8.9
2018-07-01 | 6.3
2018-10-01 | 4.2
2019-01-01 | 5.1
2019-04-01 | 3.8
2019-07-01 | 3.8
2019-10-01 | 4.44
Test Data: (A_pred is to be forecasted)
YearQtr | A_pred
-- | --
2020-01-01 |
2020-04-01 |
2020-07-01 |
2020-10-01 |
I tried
cv_results = cross_validation( model = m, horizon = pd.to_timedelta(4, unit="QS"))
and
cv_results = cross_validation(model = m, horizon='4Q')
but no luck!!
TIA.
@sunnyguntuka a quarter is not a fixed amount of time, and so you can't use it as a timedelta.
Your best bet will be to specify horizon in days, and then directly specify the cutoff dates at whatever quarterly frequency you want. This is a new option as of version 0.7; it is described in the documentation: https://facebook.github.io/prophet/docs/diagnostics.html . Basically you just pass the dates in as the cutoffs arg, and don't specify initial or period.
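A minimal sketch of that suggestion, using the date range of the quarterly train data posted above. The particular cutoff dates and the 183-day horizon (roughly two quarters) are illustrative assumptions, picked so that every cutoff still has a full horizon of observed data after it; the Prophet call itself is left as a comment because it needs a fitted model.

```python
import pandas as pd

# Last observed quarter in the posted train data.
history_end = pd.Timestamp("2019-10-01")

# Cutoffs placed on quarter starts (illustrative choice of two cutoffs).
cutoffs = pd.date_range(start="2019-01-01", end="2019-04-01", freq="QS")

# Horizon given in days, since a quarter is not a fixed-length timedelta.
horizon = pd.Timedelta("183 days")

for c in cutoffs:
    print("cutoff:", c.date(), "| forecasts up to:", (c + horizon).date())

# With a fitted model m (not constructed here), the call would look like:
# cv_results = cross_validation(model=m, cutoffs=cutoffs, horizon="183 days")
```

With so few quarterly observations, each cutoff leaves very little training history, so the resulting error estimates should be read with caution.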