Hi,
I am trying to construct a class that initializes the models, updates the GP as new data arrives, and predicts the next point for evaluation: the same steps described in the tutorials, but in an object-oriented way. To start with, I just copied the code from the developer API tutorial and ran the script without any modification, which worked as expected. However, when I fit a botorch model inside a test function of a unittest.TestCase, it prints a warning (twice for each gpei construction):
...\Anaconda\envs\...\lib\site-packages\botorch\models\utils.py:189: InputDataWarning:
Input data is not standardized. Please consider scaling the input to zero mean and unit variance.
When I debug the code, it seems that the tutorial also triggers the warning but does not print it (I'm using the PyCharm Run tool). As far as I could trace, the modelbridge module handles the required transformations, so I'm not sure whether I need an extra step to standardize the inputs.
Can anyone guide me through?
Hmm, so the modelbridge should properly standardize the inputs if you're using the vanilla dev API tutorial. Let me take a look.
Hmm ok, this is interesting. I traced this down to the different default behaviors of np.std (used in the modelbridge) and torch.std (used in the botorch code for validation). Essentially, torch.std applies Bessel's correction by default (dividing by N-1 instead of N) to get an unbiased estimate of the variance of the "parent population", whereas np.std divides by N by default.
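To make the mismatch concrete, here is a minimal sketch using only the standard library. It assumes that statistics.pstdev (divide by N) mirrors the np.std default and that statistics.stdev (divide by N-1) mirrors the torch.std default; the data values are made up for illustration and this is not the actual Ax or BoTorch code.

```python
import math
import statistics

# Made-up sample; any data with N > 1 shows the same effect.
x = [1.0, 2.0, 3.0, 4.0]
n = len(x)

mean = statistics.fmean(x)
pop_std = statistics.pstdev(x)  # divides by N, like the np.std default (ddof=0)

# Standardize the way a divide-by-N transform would.
z = [(v - mean) / pop_std for v in x]

# Re-check with both conventions.
check_pop = statistics.pstdev(z)    # divide by N: exactly unit std
check_sample = statistics.stdev(z)  # divide by N-1, like the torch.std default

# The sample-std check is off by a factor of sqrt(N / (N - 1)).
expected_ratio = math.sqrt(n / (n - 1))
print(check_pop, check_sample, expected_ratio)
```

Data standardized with the divide-by-N estimate therefore does not have unit standard deviation under the divide-by-(N-1) estimate, so a validation check using the latter can flag it. The gap shrinks as N grows.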
There isn't really any fancy stats going on here, just normalization, so it doesn't really matter what we use so long as we are being consistent (actually, it probably doesn't matter at all, since things will be internally consistent in BoTorch and the differences will be so tiny that it won't affect the model fitting much if at all). Either way, we should not issue these warnings, so I'll figure out the right place to fix this (Ax or BoTorch) and put up a PR to address this.
Finally, we should also investigate why this warning is not raised in the tutorial; this could be problematic in other, less benign cases.
Thank you very much for the detailed explanation.
I was able to raise the warnings in the tutorial script by adding import warnings and warnings.simplefilter("default"). But I don't know whether the warnings were filtered by the PyCharm tool by default or suppressed by some other means.
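For anyone trying to reproduce this, the effect of the warnings filter can be sketched in isolation. DemoInputDataWarning and fit_demo below are hypothetical stand-ins, not botorch names; only the warnings module calls are real. As a side note, the unittest runner is documented to set the filter to "default" unless a -W option is passed, which may be part of why the warning shows up under unittest, though that has not been confirmed for this case.

```python
import warnings


class DemoInputDataWarning(UserWarning):
    """Hypothetical stand-in for a library warning category."""


def fit_demo():
    # Hypothetical stand-in for a model fit that warns about its inputs.
    warnings.warn("Input data is not standardized.", DemoInputDataWarning)
    return "fitted"


def count_warnings(action):
    # Record warnings under a temporary filter; the previous filters
    # are restored when the context manager exits.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        fit_demo()
        fit_demo()
    return len(caught)


ignored = count_warnings("ignore")  # filter drops every warning
always = count_warnings("always")   # filter reports every occurrence
print(ignored, always)
```

The same code path emits the warning in both runs; only the active filter decides whether it is reported. With "default", Python reports the first occurrence for each location and suppresses repeats from that location.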