Ax: High-dimensional BayesOpt: appropriate optimization algorithm for large number of parameters (GPEI is very slow with 128 hyperparameters)

Created on 9 May 2021  ·  5 Comments  ·  Source: facebook/Ax

Hi everybody,

I tried to run Ax for my use case, which has 128 hyperparameters. Unfortunately, sampling new configurations takes a very long time. Here is a quick breakdown of how long each step takes:

sobol = Models.SOBOL(exp.search_space)
for i in range(5):
    exp.new_trial(generator_run=sobol.gen(1))

=> 0.23s

eval_data = exp.eval()

=> 1.46s

gpei = Models.GPEI(experiment=exp, data=eval_data)

=> 0.26s

batch = exp.new_trial(generator_run=gpei.gen(1))

=> 390.18s
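
A breakdown like the one above can be collected with a small timing helper. This is a minimal sketch using only the Python standard library; the `timed` helper and the placeholder workloads are illustrative assumptions, not part of Ax, and the placeholders would be replaced by the real calls such as `gpei.gen(1)`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}

with timed("sobol", timings):
    sum(range(100_000))  # placeholder: stands in for the Sobol trial loop

with timed("gpei_gen", timings):
    sum(range(1_000_000))  # placeholder: stands in for gpei.gen(1)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.2f}s")
```

Timing each step separately makes it easier to confirm that candidate generation, rather than model construction or evaluation, is the bottleneck.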

I would be happy if any of you had some hints on how to speed up the sampling. Is there a better model or sampling strategy to speed things up here?

Thank you for your help.

Best regards,
Felix

All 5 comments

Hi @felixhandte! 128 hyperparameters is a lot, far more than our standard GPEI algorithm can handle comfortably. cc @bletham, @Balandat, who might be able to discuss alternative approaches with you.

Hi @ldworkin,

thank you for your quick response! I would appreciate any ideas on alternative approaches :)

Best regards,
Felix

Yeah, d=128 is much too high for standard BayesOpt. I see three possible options:

  • Try to identify the 10 or so most important parameters and optimize only those (obviously not ideal if there are many interdependencies between parameters and they are all important to some degree, but it could get you most of the way there)
  • If they are all continuous parameters, you could try the ALEBO high-dimensional optimization method that is built into Ax (https://github.com/facebookresearch/alebo), though it does assume that there is some low-dimensional structure to the search space (e.g. most of the parameters are unimportant), and its performance will really depend on the degree to which that is true. It will not work well with categorical / choice parameters.
  • You can use the TuRBO method for high-dimensional BayesOpt, which is not in Ax but has an open source implementation here: https://github.com/uber-research/TuRBO . It's probably the best general-purpose high-dimensional BayesOpt method for problems without any particular structure.
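
The first option (screening for the most important parameters) can be sketched as follows. This is only an illustration under strong assumptions: it ranks parameters by the absolute linear correlation between each parameter and the objective over random samples, using a made-up toy objective in which only `x0` and `x1` matter. The names `pearson` and `rank_parameters` are hypothetical helpers, not Ax functions. A linear screen like this misses interactions and non-monotonic effects, and categorical parameters would need a different treatment.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def rank_parameters(samples, scores):
    """Return parameter names sorted by |correlation| with the scores, strongest first."""
    names = list(samples[0])
    strength = {
        name: abs(pearson([s[name] for s in samples], scores)) for name in names
    }
    return sorted(names, key=lambda name: strength[name], reverse=True)


# Toy setup (assumption): 128 continuous parameters in [0, 1], only two of them matter.
rng = random.Random(0)
names = [f"x{i}" for i in range(128)]
samples = [{name: rng.random() for name in names} for _ in range(500)]
scores = [5.0 * s["x0"] - 3.0 * s["x1"] for s in samples]

top = rank_parameters(samples, scores)[:10]
print(top)
```

The resulting shortlist could then define a reduced search space for a standard GP-based method, with the remaining parameters held at fixed defaults.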

Re TuRBO, we have #474 open, though we haven't found the time to work on it much recently.

Thank you for your advice. Since I have quite a few categorical hyperparameters, I will check out TuRBO first :)
