How can I update the model with new data? I tried pickling the Prophet instance and running fit() on new data but I bump into this "Changepoints must fall within training data" exception. I'd prefer not to disable the automatic changepoint detection. Thanks!
Thanks for pointing this out. The Prophet object wasn't really intended to be re-fit. What is happening here is that the changepoints are being automatically set using the first dataset, but then they don't make sense for the second dataset so it is throwing the error.
Is there a particular reason why you want to re-fit an existing Prophet object instead of just instantiating a new one? Are you trying to get the changepoint locations to be consistent across the multiple fits?
I'm inclined to fix this issue by entirely disallowing a Prophet model to be fit twice, so I want to know if there is a reason for making that a possibility.
I'm only evaluating Prophet right now, but I'm wondering about some operational details. For example, what should I do with a new day's worth of data? Should I fit that along with all the prior data, which could be years? It seems like a waste of compute resources to re-fit data that has already been fit. It's probably something silly I'm not understanding.
I see, this is related to #46. Currently you would have to completely re-fit when adding a new day of data.
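To make the "completely re-fit" workflow concrete, here is a minimal sketch of the pattern. It does not use Prophet's API: `MeanModel` and `refit_from_scratch` are hypothetical names, and the stand-in model only forecasts the historical mean. The point is the shape of the loop: each time new data arrives, a brand-new model instance is created and fit on the full history, and an already-fit instance is never fit again.

```python
class MeanModel:
    """Hypothetical stand-in for a forecaster that may only be fit once."""

    def __init__(self):
        self.mean = None

    def fit(self, values):
        # Refuse a second fit instead of silently reusing stale state.
        if self.mean is not None:
            raise RuntimeError("model already fit; create a new instance")
        self.mean = sum(values) / len(values)
        return self

    def predict(self, horizon):
        # Forecast the historical mean for every future step.
        return [self.mean] * horizon


def refit_from_scratch(model_factory, history, horizon):
    """Build a new model, fit it on the full history, and forecast."""
    model = model_factory()
    model.fit(history)
    return model.predict(horizon)


history = [10.0, 12.0, 14.0]
first = refit_from_scratch(MeanModel, history, horizon=2)

history.append(20.0)  # a new day of data arrives
second = refit_from_scratch(MeanModel, history, horizon=2)
```

With Prophet the analogous step would be to construct a new `Prophet()` object and call `fit` on a DataFrame holding all of the data, old and new.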
It does seem wasteful and I like the idea of warm-starting the fit from the current parameters if the object has already been fit to earlier data, although it could potentially be a bit complicated. The main complication is the changepoints and the corresponding fitted rate changes.

Imagine we have fit Prophet to data on t=1,...,100 and now we are re-fitting on t=50,...,150. If we are automatically selecting changepoints, then the changepoints in the 2nd dataset will be very different from those in the 1st. This means that the rate change values from the 1st dataset could potentially be really inappropriate for the second dataset, and this 'warm-start' could actually be a bad initialization that harms the fit.

If we keep the changepoints constant when moving from dataset1 to dataset2 then the warm-start would be meaningful, and that would work well if we've just added one day of data. But now suppose that we've been running this for 50 days, each day adding one new day of data. We would need to automatically add additional changepoints as the time series gets longer, otherwise the performance would degrade compared to a fresh fit. That's where things get a bit complicated and is why this is on the wishlist, especially since the fitting is pretty fast.
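The t=1,...,100 versus t=50,...,150 example can be illustrated with a small sketch. It assumes candidate changepoints are spread evenly over the first 80% of the history, which is a simplification and not Prophet's actual code; `candidate_changepoints`, `n_changepoints` and `changepoint_range` are illustrative names and values chosen for this example.

```python
def candidate_changepoints(times, n_changepoints=5, changepoint_range=0.8):
    """Spread candidate changepoints evenly over the first part of the history."""
    # Only the first `changepoint_range` fraction of observations is eligible.
    hist_size = int(len(times) * changepoint_range)
    n = min(n_changepoints, hist_size - 1)
    # Evenly spaced positions in [0, hist_size - 1], skipping the first point.
    idx = [round(i * (hist_size - 1) / n) for i in range(1, n + 1)]
    return [times[i] for i in idx]


first = candidate_changepoints(list(range(1, 101)))    # data on t = 1..100
second = candidate_changepoints(list(range(50, 151)))  # data on t = 50..150

# Changepoint locations that appear in both fits.
shared = sorted(set(first) & set(second))
```

Under these assumptions the two grids have no location in common, so a rate change fitted at a changepoint in the first dataset has no matching changepoint in the second. That is why reusing those values as a warm start could be a poor initialization.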
34c0f80684b871d49e2112077e62e253e184d430 raises an exception if fit is called on a fitted model which should make this more clear in the future, until we do get some sort of warm-start update figured out.
Oh, that other issue addresses my question - I'll follow that one. That error msg makes it more clear. Thanks!
I think that updating a statistical model for time series is useful when, for example, you want to compare it with a deep learning model for time series, using a rolling window with an updated origin. If I want to make a short-term forecast, i.e. 30 minutes for one day, it would be unfair to the deep learning model to refit the statistical model before each forecast, as the deep learning model "sees" only the training data and is then provided with the true values for some given window (t-n, ..., t) to forecast the next 30 minutes. Any comment on this? Thank you in advance.
@bletham, what I want to know is this: suppose I want to build a generalized model using Prophet. I get good results when I disable automatic changepoint selection and set the changepoints manually, but since that fixes the changepoints, it is hard to use the model for another dataset.
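For reference, a rolling-origin evaluation of the kind described above could be sketched as follows. `rolling_origin_splits` and its parameters are hypothetical names for this example and are not part of Prophet. It only produces the index boundaries; whether each model is refit at every origin or kept frozen and just fed the latest window is a separate choice made by the caller.

```python
def rolling_origin_splits(n_obs, initial, horizon, step):
    """Yield (train_end, test_end) index pairs for a rolling forecast origin.

    Observations [0, train_end) are available at forecast time and
    observations [train_end, test_end) are the values to be forecast.
    """
    origin = initial
    while origin + horizon <= n_obs:
        yield origin, origin + horizon
        origin += step


# 10 observations, first origin after 6 of them, forecasting 2 steps at a time.
splits = list(rolling_origin_splits(n_obs=10, initial=6, horizon=2, step=2))
```

For a like-for-like comparison, both models would be evaluated on the same `splits`, with the update policy (refit versus frozen parameters) stated explicitly for each.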