I encountered a strange result while optimizing my AllenNLP model, and it may be hard to reproduce. Some rows are missing from trial_params. Surprisingly, the trials table contains results for the trials whose parameters are missing, which suggests that those runs completed successfully.
[Screenshot: trials table]
[Screenshot: trial_params table]
The rows for trial_id == 2, trial_id == 3, and trial_id == 4 are missing from trial_params. All other rows up to trial_id == 40 are fine.
Expected behavior: the trial_params table is filled in for every trial.
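The gap described above can be checked directly against the SQLite file. The following is a minimal sketch, not part of the original report: it assumes both tables expose a trial_id column (the report only shows it for trial_params), and the helper names and the demo database are hypothetical.

```python
import sqlite3
from typing import List


def find_trials_without_params(conn: sqlite3.Connection) -> List[int]:
    """Return the trial_id of every row in trials that has no row in trial_params."""
    query = """
        SELECT t.trial_id
        FROM trials AS t
        LEFT JOIN trial_params AS p ON p.trial_id = t.trial_id
        WHERE p.trial_id IS NULL
        ORDER BY t.trial_id
    """
    return [row[0] for row in conn.execute(query)]


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in with only the columns this check needs (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trials (trial_id INTEGER PRIMARY KEY, state TEXT)")
    conn.execute(
        "CREATE TABLE trial_params ("
        "param_id INTEGER PRIMARY KEY, trial_id INTEGER, param_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO trials VALUES (?, ?)",
        [(trial_id, "COMPLETE") for trial_id in range(1, 6)],
    )
    # Only trials 1 and 5 get a parameter row; 2, 3 and 4 are left empty on purpose.
    conn.executemany(
        "INSERT INTO trial_params (trial_id, param_name) VALUES (?, ?)",
        [(1, "DROPOUT"), (5, "DROPOUT")],
    )
    return conn
```

Against the real file this would be called as find_trials_without_params(sqlite3.connect("results.db")). A trial that is still running and has not suggested any parameter yet would also be listed, so the check is most meaningful once the workers have finished.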
I removed only allennlp.common.params.infer_and_cast in AllenNLPExecutor. It doesn't work with null values in jsonnet. In addition, I believe that such casting is unnecessary anyway, since jsonnet is designed so that the user is responsible for type casting, e.g. parseInt, parseJson (https://jsonnet.org/ref/stdlib.html).

I have 4 GPUs, so I start 4 runs and want them all to share one SQLite database.
For each run, set a different device: export CUDA_DEVICE=0, export CUDA_DEVICE=1, and so on (https://optuna.readthedocs.io/en/stable/faq.html#how-can-i-use-two-gpus-for-evaluating-two-trials-simultaneously).
Then start each run with python optuna_code.py

from optuna import Trial, create_study
from optuna.integration.allennlp import AllenNLPExecutor


def objective(trial: Trial) -> float:
    # The CUDA_DEVICE and DEBUG env variables must be defined externally to support multiple GPUs.
    trial.suggest_categorical("POOLING", ["mean", "cls"])
    trial.suggest_float("DROPOUT", 0.0, 0.8)
    trial.suggest_float("ALPHA", 0.0, 1.0)
    trial.suggest_float("GAMMA", 0.0, 5.0)
    trial.suggest_float("LEARNING_RATE", 2e-7, 2e-5, log=True)
    trial.suggest_float("WEIGHT_DECAY", 1e-5, 1e5, log=True)

    executor = AllenNLPExecutor(
        trial=trial,
        config_file="./configs/config_name.jsonnet",
        serialization_dir=f"/experiments/optuna/{trial.number}",
        metrics="best_validation_roc_auc",
        include_package=["my_package"],
    )
    return executor.run()


if __name__ == "__main__":
    # Every worker opens the same SQLite file and joins the existing study.
    study = create_study(
        study_name="study_name",
        storage="sqlite:///results.db",
        direction="maximize",
        load_if_exists=True,
    )
    study.optimize(func=objective, n_jobs=1, n_trials=5, show_progress_bar=True)
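The manual steps above (one export CUDA_DEVICE=N per run, then python optuna_code.py) could also be scripted. This is only a sketch of that idea using the standard library; build_worker_envs and launch_workers are hypothetical helper names, and it assumes every worker should inherit the rest of the current environment, including DEBUG.

```python
import os
import subprocess
import sys
from typing import Dict, List


def build_worker_envs(n_gpus: int, base_env: Dict[str, str]) -> List[Dict[str, str]]:
    """Return one environment per worker, each pinned to a different CUDA_DEVICE."""
    envs = []
    for gpu in range(n_gpus):
        env = dict(base_env)  # copy so the workers do not share one dict
        env["CUDA_DEVICE"] = str(gpu)
        envs.append(env)
    return envs


def launch_workers(script: str, n_gpus: int) -> List[int]:
    """Start one process per GPU, wait for all of them, and return their exit codes."""
    procs = [
        subprocess.Popen([sys.executable, script], env=env)
        for env in build_worker_envs(n_gpus, dict(os.environ))
    ]
    return [proc.wait() for proc in procs]
```

With the setup from this report, the call would be launch_workers("optuna_code.py", 4). All four processes start almost simultaneously, which is the situation in which the missing rows were observed.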
1) I believe that this bug is hard to debug, since the next time I run the code it may work fine. It looks like a race condition to me.
2) The missing rows cause the hyperparameter importance evaluation to fail.

Thanks for the report. It seems like a severe bug, and likely a duplicate of https://github.com/optuna/optuna/pull/1498, considering your number of parallel workers and the missing rows. If possible, could you try the latest master branch and see if the problem persists?
@hvy Thanks for your response. I updated my fork to Optuna 2.0.0 and it seems to work for now (trial_params is correct for the first 4 runs). I will let you know when the whole optimization ends.
The only thing I removed is allennlp.common.params.infer_and_cast in AllenNLPExecutor. It doesn't work with null values in jsonnet. In addition, I believe that such casting is unnecessary anyway, since jsonnet is designed so that the user is responsible for type casting, e.g. parseInt, parseJson (https://jsonnet.org/ref/stdlib.html).
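To illustrate the point about explicit casting, here is a rough Python analogue of applying parseJson to each value on the config side. It is a sketch under the assumption that hyperparameters reach the config as strings; json.loads stands in for jsonnet's parseJson, and parse_ext_vars and the WARMUP name are hypothetical.

```python
import json
from typing import Any, Dict


def parse_ext_vars(ext_vars: Dict[str, str]) -> Dict[str, Any]:
    """Parse each string as JSON; keep values that are not valid JSON as plain strings."""
    parsed = {}
    for name, raw in ext_vars.items():
        try:
            parsed[name] = json.loads(raw)  # "0.25" becomes a float, "null" becomes None
        except json.JSONDecodeError:
            parsed[name] = raw  # a bare categorical value such as mean is left untouched
    return parsed
```

With explicit parsing like this, a null value is just another JSON literal, and a categorical value such as mean stays a string because it is never parsed as a number.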
Hey @mateuszpieniak, thank you very much for giving AllenNLPExecutor a try.
I used infer_and_cast in AllenNLPExecutor to parse floating-point values
(e.g. without it, script and config fail with the error TypeError: Expected embedding_dropout to be numeric.).
As you pointed out, this trick could be removed by guiding users to use parseJson.
I'll update the executor implementation.
Thank you!