Prophet: Allow for monthly/annual option for cross_validation()

Created on 22 Jun 2018 · 8 Comments · Source: facebook/prophet

Currently, cross_validation() relies on base::as.difftime() to compute time slices. The limitation of the base function is that its largest unit is weeks.

To support monthly and annual datasets, it would be ideal to replace it with mondate::as.difftime(), which allows the units 'months' and 'years'.
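
For anyone who wants to confirm the limitation, here is a minimal sketch using only base R (nothing from prophet or mondate is called). The variable names are just for illustration.

```r
# For numeric input, base::as.difftime() accepts units up to "weeks".
one_week <- as.difftime(1, units = "weeks")
print(one_week)

# Asking for "months" raises an error; capture its message to inspect it.
month_attempt <- tryCatch(
  as.difftime(1, units = "months"),
  error = function(e) conditionMessage(e)
)
print(month_attempt)
```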

enhancement

All 8 comments

That seems reasonable to me, and would better match the Python functionality that does have those options.

Glad to hear @bletham :)

Look forward to updates.

I came here to post this exact issue, @leungi! I am glad someone else has picked it up :)
@bletham, I was trying to evaluate a one-step-ahead monthly forecast on monthly data by playing with the period and horizon options.

Does the code below make any sense at all as a work-around? I am basically trying to redefine one unit of time as 30.41667 days since we cannot have unit = "month" at the moment.

    ## Using the example and dataset from the prophet guide
    library(prophet)

    df <- read.csv("C:/Github/prophet/examples/example_retail_sales.csv")
    m <- prophet(df, seasonality.mode = 'multiplicative')

    ## Use all data for the fit and make a 10-year-ahead forecast
    future <- make_future_dataframe(m, periods = 120, freq = 'month')
    fcst <- predict(m, future)
    plot(m, fcst)

    ## Try: out-of-sample forecast. Train on 20 years initially (expanding
    ## after that) and forecast 1 month ahead.
    ## Does this here make sense?
    tscv.myfit <- cross_validation(m, horizon = 365/12, units = "days",
                                   period = 365/12,
                                   initial = 20*12*(365/12))

@APramov what you proposed works great; decimal numbers are not an issue. In the code, these are converted to time difference objects like

as.difftime(horizon, units=units)

so here it would be

as.difftime(365/12, units = "days")
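
To see what that fractional value turns into, here is a small base R sketch. It only inspects the difftime object and does not call prophet; the reference date is an arbitrary example, not something taken from the package.

```r
# A fractional number of days is a perfectly valid difftime.
horizon <- as.difftime(365 / 12, units = "days")
print(units(horizon))

# The same span expressed in hours (365 / 12 days is roughly 730 hours).
horizon_hours <- as.numeric(horizon, units = "hours")
print(horizon_hours)

# Subtracting it from a timestamp lands partway through a day, not on a
# calendar boundary. The date below is an arbitrary example.
ref_time <- as.POSIXct("2016-05-01", tz = "UTC")
cutoff <- ref_time - horizon
print(cutoff)
```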

@APramov, kudos for the workaround :+1:

I suppose the drawback is that each month has a different number of days?
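
A rough way to gauge that drawback is to compare cutoffs spaced 365/12 days apart with true month-start dates. The sketch below uses base R only; the end date is an arbitrary assumption and the variable names are hypothetical.

```r
# Arbitrary end date, chosen only for illustration.
last_date <- as.POSIXct("2016-05-01", tz = "UTC")
k <- 0:11

# Cutoffs stepping back by a fixed 365/12 days (converted to seconds).
step_secs <- as.numeric(as.difftime(365 / 12, units = "days"), units = "secs")
fixed_cutoffs <- last_date - k * step_secs

# Cutoffs stepping back by calendar months.
calendar_cutoffs <- seq(last_date, by = "-1 month", length.out = 12)

# Difference between the two schemes, in days, for each step back.
drift_days <- as.numeric(difftime(fixed_cutoffs, calendar_cutoffs, units = "days"))
print(data.frame(calendar = calendar_cutoffs, drift_days = round(drift_days, 2)))
```

The drift is zero at the starting point and then wanders by fractions of a day up to a day or so, which may or may not matter depending on how the observations are timestamped.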

@bletham Hi Ben, just wondering what's the status of this enhancement for R?

+1
I used the code mentioned here: https://github.com/facebook/prophet/issues/949#issuecomment-487994184

It works, but the horizon values returned by performance_metrics are timedelta values, which makes them difficult to interpret.

I was incorrect above when I stated that the Python version supports monthly/annual options. It uses pd.Timedelta, which is actually even stricter than R's difftime and doesn't support units above days.

The reason monthly and annual units are not supported for Timedelta/difftime is that they are not fixed units; the length of a month depends on the month. So the logic for subtracting "one month" from a date is going to be quite a bit more complicated than subtracting a fixed amount of time (like a day). So I'm inclined to leave this as-is.

I think the better option for monthly cross validation would be to allow the user to manually specify the locations of the cutoffs, which could then be e.g. all month-end or month-start dates as desired. I'll open a new issue for that.
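
As a sketch of what manually chosen cutoffs could look like, the base R code below builds month-start cutoffs for a monthly series. `monthly_cutoffs` is a hypothetical helper, not part of prophet, and the dates are synthetic; how the cutoffs would be handed to cross_validation() is left open here.

```r
# Hypothetical helper (not a prophet function): month-start cutoffs such that
# each cutoff has at least `initial_months` monthly observations up to and
# including it, and `horizon_months` observations after it.
monthly_cutoffs <- function(dates, initial_months, horizon_months = 1) {
  dates <- as.Date(dates)
  first_month <- as.Date(format(min(dates), "%Y-%m-01"))
  last_month <- as.Date(format(max(dates), "%Y-%m-01"))
  all_months <- seq(first_month, last_month, by = "month")
  lo <- initial_months
  hi <- length(all_months) - horizon_months
  if (lo < 1 || hi < lo) return(all_months[0])
  all_months[lo:hi]
}

# Synthetic monthly dates, assumed for illustration only.
dates <- seq(as.Date("1992-01-01"), as.Date("2016-05-01"), by = "month")
cutoffs <- monthly_cutoffs(dates, initial_months = 20 * 12, horizon_months = 1)

# Train/test split for the first cutoff: train up to and including the
# cutoff, test on the single month that follows it.
cutoff <- cutoffs[1]
next_month <- seq(cutoff, by = "month", length.out = 2)[2]
train_dates <- dates[dates <= cutoff]
test_dates <- dates[dates > cutoff & dates <= next_month]
print(range(cutoffs))
```

Because the cutoffs come from calendar arithmetic rather than a fixed difftime, every fold ends exactly on a month start regardless of month length.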
