Prophet: Prophet for hourly time series forecast

Created on 25 Feb 2019 · 4 comments · Source: facebook/prophet

Hi Devs,

I'm currently testing the prophet package on an hourly time series (a dataset from the M4 forecasting competition [1]). Interestingly, this dataset contains no timestamp information (year/month/day/hour) for the series; it carries only the raw observations. Since prophet expects a data frame with ds (the timestamps) and y (the observed values), I had to generate a vector of timestamps synthetically. Here is the relevant excerpt of the code:

library(prophet)
# time_series_data: numeric vector holding the 700 hourly observations
ts <- seq(from = as.POSIXct("2012-05-15 07:00"), length.out = 700, by = "hour")
history <- data.frame(ds = ts, y = time_series_data)
ts_fit <- prophet(history, daily.seasonality = TRUE, weekly.seasonality = TRUE, yearly.seasonality = FALSE)

My question is whether generating such timestamps synthetically affects the performance of the model. In other words, if I start the sequence at a different timestamp, e.g. seq(from = as.POSIXct("2013-05-31 08:00"), length.out = 700, by = "hour"), would that affect the forecast accuracy?

Also, is this the correct way of handling a time series in the absence of timestamp information, or are there other alternatives?

Thanks,
Kasun.

[1] https://www.mcompetitions.unic.ac.cy/the-dataset/

All 4 comments

I haven't downloaded the dataset, but in general my guess is that it depends on what kind of temporal relationship your data holds. You may find the same seasonal relationship, but its exact timing might be off. For example, let's assume the second day of every week in this data shows extremely high values. If the original data starts on a Monday, you should see high values every Tuesday. However, since the original dataset has no timestamps, you might assume that the data starts on a Tuesday. In that case, the high values would be shifted to Wednesdays instead of Tuesdays.

I'm not an expert on time series, so you may want to wait for others' feedback and their suggestions for best practices or alternatives.
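The weekday shift described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: the peak pattern (every second day of a 7-day cycle) is the hypothetical one from the comment, the start dates are arbitrary, and the helper name peak_weekdays is made up for this example. It uses Python's standard library only and does not call Prophet.

```python
from datetime import datetime, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def peak_weekdays(start, n_hours=700):
    """Weekday names that the peak observations receive for an assumed start."""
    names = set()
    for i in range(n_hours):
        day_index = i // 24      # which day of the series this hour belongs to
        if day_index % 7 == 1:   # hypothetical pattern: peaks on the 2nd day of each cycle
            stamp = start + timedelta(hours=i)
            names.add(DAY_NAMES[stamp.weekday()])
    return names

true_start = datetime(2012, 5, 14)     # a Monday
assumed_start = datetime(2012, 5, 15)  # a Tuesday

print(peak_weekdays(true_start))
print(peak_weekdays(assumed_start))
```

The 7-day pattern itself is unchanged between the two runs; only the weekday label attached to it moves by one day.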

Hi @jbanik85,

Many thanks for sharing your thoughts on this. I agree: if changing the timestamps keeps the seasonal relationship intact, it shouldn't affect the final model. Let's hear what others think about this explanation.

Thanks,
Kasun.

Internally in the model, "time" is represented as a float and the dates are scaled so that the history falls on [0, 1]. That happens right here: https://github.com/facebook/prophet/blob/master/python/fbprophet/forecaster.py#L268

So, any constant shift or scaling to the dates in 'ds' will definitely have no effect on the forecast.

The only exception to this would be if you use built-in holidays.
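A minimal sketch of why a constant shift drops out, assuming the scaling is a plain linear map of the history onto [0, 1] as described in this comment. It is not Prophet's actual code: the function names scaled_time and hourly are made up for illustration, and the single "holiday" date (US Memorial Day 2012, 28 May) stands in for a built-in holiday list.

```python
from datetime import date, datetime, timedelta

def hourly(start, n=700):
    """Synthetic hourly timestamps, as in the question."""
    return [start + timedelta(hours=i) for i in range(n)]

def scaled_time(stamps):
    """Map timestamps linearly onto [0, 1]: first -> 0.0, last -> 1.0."""
    start, end = stamps[0], stamps[-1]
    span = (end - start).total_seconds()
    return [(s - start).total_seconds() / span for s in stamps]

a = hourly(datetime(2012, 5, 15, 7))
b = hourly(datetime(2013, 5, 31, 8))

# The shift cancels in (s - start), so both histories get the same scaled time.
print(scaled_time(a) == scaled_time(b))

# Holidays are tied to calendar dates, so they do not survive a shift.
holiday = date(2012, 5, 28)  # illustrative single-entry holiday list
hours_on_holiday_a = sum(1 for s in a if s.date() == holiday)
hours_on_holiday_b = sum(1 for s in b if s.date() == holiday)
print(hours_on_holiday_a, hours_on_holiday_b)
```

The first series covers the whole holiday while the shifted one never touches it, which is why built-in holidays are the exception mentioned above.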

Hi @bletham,

Many thanks for your clarification. I verified this empirically by using different timestamps. As you mentioned, this did not affect the final forecast.

Regards,
Kasun.
