Prophet: add_regressor with continuous variable warning: longer object length is not a multiple of shorter object length

Created on 25 Sep 2017 · 5 Comments · Source: facebook/prophet

Hi,

I really love your package and am happy that the add_regressor feature was added. However, I ran into a strange warning, and I was not very comfortable adding several regressors through multiple calls to add_regressor.

__Let's start with the warning:__

Here are two reproducible examples.

The first one is based on the tutorial example data:

library(prophet)

df <- read.csv("https://raw.githubusercontent.com/facebookincubator/prophet/master/examples/example_wp_peyton_manning.csv")

set.seed(42)
# Add random zeros and ones (the code from the original add_regressor example somehow returns only zeros)
df$nfl_sunday <- sample(c(0, 1), nrow(df), replace = TRUE)

# Fit model
m <- prophet()
m <- add_regressor(m, 'nfl_sunday')
m <- fit.prophet(m, df)

# Make future data.frame
future <- make_future_dataframe(m, periods = 365)
# Random values are not a correct way to supply future regressor values, but they suffice for this demo
future$nfl_sunday <- sample(c(0, 1), nrow(future), replace = TRUE)

This runs without trouble, and the model is fitted correctly:

Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
Initial log joint probability = -4.27118
Optimization terminated normally: 
  Convergence detected: relative gradient magnitude is below tolerance

Now let me try an example with a continuous regressor:

df <- read.csv("https://raw.githubusercontent.com/facebookincubator/prophet/master/examples/example_wp_peyton_manning.csv")

set.seed(42)
# Add random normal values
df$nfl_sunday <- rnorm(nrow(df))

# Fit model
m <- prophet()
m <- add_regressor(m, 'nfl_sunday')
m <- fit.prophet(m, df)

# Make future data.frame
future <- make_future_dataframe(m, periods = 365)
# Random values are not a correct way to supply future regressor values, but they suffice for this demo
future$nfl_sunday <- rnorm(nrow(future))

After this I get a warning message, though the model seems to have been fitted:

Initial log joint probability = -4.27118
Optimization terminated normally: 
  Convergence detected: relative gradient magnitude is below tolerance
Warning message:
In sort(unique(df[[name]])) == c(0, 1) :
  longer object length is not a multiple of shorter object length

Also it would be cool __to add several regressors__ with one call to add_regressor().

Thanks in advance!

bug ready

zhitkovk

Thanks for pointing this out, and for the really clean repro steps. You can safely ignore this warning message. It is in some code checking whether the regressor should be standardized or not, and despite the warning it is doing the right thing. It should however be fixed to not raise the warning :-)
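For readers wondering where the warning comes from, here is a minimal base-R sketch of the comparison quoted in the warning message. It does not reproduce prophet's internal code. The `is_binary` helper is a hypothetical name used only for illustration, and it is not necessarily how the fix was implemented.

```r
set.seed(42)
binary <- sample(c(0, 1), 2905, replace = TRUE)
continuous <- rnorm(2905)

# Two unique values compared with c(0, 1): the lengths match, so no warning.
sort(unique(binary)) == c(0, 1)

# 2905 unique values compared with c(0, 1): R recycles the shorter vector,
# and because 2905 is not a multiple of 2 it emits the warning.
res <- tryCatch(
  sort(unique(continuous)) == c(0, 1),
  warning = function(w) conditionMessage(w)
)
print(res)

# Hypothetical helper: check the number of unique values first, so the
# element-wise comparison only runs when both sides have length 2.
is_binary <- function(x) {
  vals <- sort(unique(x))
  length(vals) == 2 && all(vals == c(0, 1))
}

is_binary(binary)      # TRUE
is_binary(continuous)  # FALSE
```

Because `&&` short-circuits, the length check guards the comparison, so a continuous column never reaches the recycled `==`.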

Adding multiple regressors at a time seems nice but the interface might be challenging, since then we'd have to potentially specify the other inputs to add_regressor (prior_scale and standardize) separately for each regressor.

bletham on 30 Sep 2017

👍2

This is fixed in https://github.com/facebook/prophet/commit/feb7be397bc4d4932d03b08ba241d7a538ae03cb.

bletham on 5 Nov 2017

🎉1

The fix has been pushed to CRAN in v0.2.1

bletham on 9 Nov 2017

👍1

To address the second concern of adding multiple regressors with one call, could an approach similar to pandas' sort_values be used? There, a list of fields can be passed to sort by, and the ascending parameter also takes a list, so the direction of the ordering can be specified for each field independently:

df.sort_values(by=['col1','col2'], ascending=[True,False])

May seem like overkill since this can be achieved by looping through a list of regressors. But, maybe there'd be a benefit to pull that into this function?

MattConflitti on 8 Jan 2019

That's a reasonable approach, and I think I see how it would work. I am still inclined to prefer doing it one at a time to keep the interface simple, at the cost of a few more lines of code to get the model set up. For sorting, there is a dependency (ordering) in the values that I think makes it better to do it in one shot, but that isn't really the case for extra regressors.
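In the meantime, the one-at-a-time approach can be wrapped in a short loop. The sketch below is only an illustration: `add_regressors` is a hypothetical helper that is not part of prophet, the `temperature` column is made up, and the `add_fn` parameter exists only so the helper can be tried without the package installed.

```r
# Hypothetical helper (not part of prophet): register several regressors by
# calling add_regressor() once per entry.
# `specs` is a named list: each name is a regressor column, and each value is
# a (possibly empty) list of extra arguments for that regressor.
add_regressors <- function(m, specs, add_fn = prophet::add_regressor) {
  for (name in names(specs)) {
    m <- do.call(add_fn, c(list(m, name), specs[[name]]))
  }
  m
}

# Usage, guarded so the sketch also loads where prophet is not installed.
if (requireNamespace("prophet", quietly = TRUE)) {
  m <- prophet::prophet()
  m <- add_regressors(m, list(
    nfl_sunday = list(),                                         # defaults
    temperature = list(prior.scale = 0.5, standardize = FALSE)   # made-up column
  ))
}
```

Keeping the per-regressor options in a named list avoids parallel argument vectors, at the cost of a slightly more verbose call.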

bletham on 9 Jan 2019
