Hi,
I've just built a DNN with Keras, and to improve the model I need to optimize its hyperparameters (number of neurons, number of layers, learning rate, etc.). I intend to use both grid search and random search. I've seen some examples in scikit-learn, but it seems Keras isn't compatible with that.
Do I need to implement everything myself?
Or does anyone have another idea for doing this?
Thanks a lot.
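For reference, the two strategies named in the question can be sketched without any library support. This is only a rough illustration: the search space, its values, and the `evaluate` function are made up, and `evaluate` stands in for code that would train a Keras model and return its validation loss.

```python
import itertools
import random

# Hypothetical search space; names and values are illustrative only.
grid = {
    "units": [64, 256, 1024],
    "layers": [2, 3],
    "learning_rate": [0.001, 0.01, 0.1],
}


def grid_search(space):
    """Yield every combination of the listed values."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def random_search(space, n_trials, seed=0):
    """Yield n_trials configurations, sampling each parameter independently."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        yield {k: rng.choice(v) for k, v in space.items()}


def evaluate(config):
    """Placeholder objective (lower is better).

    Replace with code that trains a model and returns its validation loss.
    This toy version simply prefers 256 units and a learning rate of 0.01.
    """
    return abs(config["units"] - 256) / 256 + abs(config["learning_rate"] - 0.01)


best_grid = min(grid_search(grid), key=evaluate)
best_random = min(random_search(grid, n_trials=5), key=evaluate)
print("grid search best:", best_grid)
print("random search best:", best_random)
```

Grid search evaluates every combination, so its cost grows multiplicatively with each added parameter, while random search keeps the number of trials fixed regardless of how many parameters are tuned.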
Here's a bit of code using hyperopt to optimize a few parameters of a basic MLP. Adapt or improve as desired!
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from sklearn.metrics import roc_auc_score
import sys

# Placeholders: replace with your own training and validation arrays.
X = []
y = []
X_val = []
y_val = []

# Search space. The third layer's parameters are only sampled
# when the 'three' branch of the choice is selected.
space = {'choice': hp.choice('num_layers',
                             [{'layers': 'two'},
                              {'layers': 'three',
                               'units3': hp.uniform('units3', 64, 1024),
                               'dropout3': hp.uniform('dropout3', .25, .75)}
                              ]),
         'units1': hp.uniform('units1', 64, 1024),
         'units2': hp.uniform('units2', 64, 1024),
         'dropout1': hp.uniform('dropout1', .25, .75),
         'dropout2': hp.uniform('dropout2', .25, .75),
         'batch_size': hp.uniform('batch_size', 28, 128),
         'nb_epochs': 100,
         'optimizer': hp.choice('optimizer', ['adadelta', 'adam', 'rmsprop']),
         'activation': 'relu'
         }
def f_nn(params):
    # Written against the Keras 1.x API (output_dim, init, nb_epoch, predict_proba).
    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation

    print('Params testing: ', params)
    model = Sequential()
    # hp.uniform samples floats, so cast layer sizes and batch size to int.
    model.add(Dense(output_dim=int(params['units1']), input_dim=X.shape[1]))
    model.add(Activation(params['activation']))
    model.add(Dropout(params['dropout1']))

    model.add(Dense(output_dim=int(params['units2']), init="glorot_uniform"))
    model.add(Activation(params['activation']))
    model.add(Dropout(params['dropout2']))

    # Optional third hidden layer, depending on the sampled branch.
    if params['choice']['layers'] == 'three':
        model.add(Dense(output_dim=int(params['choice']['units3']), init="glorot_uniform"))
        model.add(Activation(params['activation']))
        model.add(Dropout(params['choice']['dropout3']))

    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=params['optimizer'])

    model.fit(X, y, nb_epoch=params['nb_epochs'], batch_size=int(params['batch_size']), verbose=0)

    pred_auc = model.predict_proba(X_val, batch_size=128, verbose=0)
    acc = roc_auc_score(y_val, pred_auc)
    print('AUC:', acc)
    sys.stdout.flush()
    # fmin minimizes 'loss', so return the negative AUC to maximize the AUC.
    return {'loss': -acc, 'status': STATUS_OK}


trials = Trials()
best = fmin(f_nn, space, algo=tpe.suggest, max_evals=50, trials=trials)
print('best: ')
print(best)
In case you're interested, with _hyperas_ you can use jinja-style templates directly within your keras model, instead of having to define the space separately. I've been using this little wrapper for a while and like to think it's pretty useful for quick experiments:
Thank you for the tips, I will give both a try :)
@jacobzweig I've tried to adapt the example you offered to my own model, but I ran into a small problem:
def space():
    space = {'num_layer' : hp.choice('num_layer', [{'layers':'add1'}, {'layers':'add2'},
                                                   {'layers':'add3'}, {'layers':'add4'}]),
             'activation' : hp.choice('activation', ['ELU(alpha=1.0)', 'Activation(tanh)']),
             'optimizer' : hp.choice('optimizer', ['SGD(lr=0.03, decay=1e-7, momentum=0.15, nesterov=True)', 'RMSprop', 'Adadelta', 'Adam']),
             'dropout1' : hp.uniform('dropout1', 0.25, 0.75),
             'dropout2' : hp.uniform('dropout2', 0.05, 0.5),
             'nb_epochs' : 150,
             #'units' : hp.quniform('units', 800,1400,2),
             'units' : hp.choice('units', [1024, 1512, 2048, 2560]),
             'regularizer' : hp.choice('regularizer', ['l2', 'activity_l2']),
             }

def model(space,X_train,Y_train,X_test,Y_test):
    model = Sequential()
    model.add(Dense(output_dim=space['units'], input_dim=X_train.shape[1], init='he_uniform', W_regularizer=l2(l=0.0001)))
    print('it is ok add layer')
    ......
When I run this, it always fails with the following error:
File "mtrand.pyx", line 220, in mtrand.cont2_array_sc (numpy/random/mtrand/mtrand.c:2902)
TypeError: an integer is required
It seems that the error occurs when 'add(Dense)' is called.
Thinking the cause might be that units is not an int, I've tried hp.choice, hp.quniform, and hp.uniform for the units definition, but none of these solves it.
Could you give me a hint about the cause, please?
Sorry - not really sure what your error is. Looks like mtrand is something with numpy... perhaps try updating your numpy installation?
Thanks for the link - It'd be helpful to add an example like this to the docs too.
With the scikit-learn wrapper, how would you guide the search based on 'best validation score' within a given number of epoch runs? For example, say nb_epoch=100 is fixed, but one configuration achieved its best validation error at epoch 30 and another at epoch 50. It seems GridSearchCV will score the model only after all 100 epochs have run.
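One possible approach, sketched here under assumptions, is to record the validation metric after every epoch and score each configuration by its best epoch instead of its last one. In Keras, `model.fit` returns a `History` object whose `history` dict holds per-epoch values such as `val_loss` when validation data is supplied. The sketch below uses plain lists with made-up numbers in place of real training runs, and the configuration names are hypothetical.

```python
def best_epoch_score(val_losses):
    """Return (best_loss, epoch) for a list of per-epoch validation losses.

    The returned epoch is 1-based.
    """
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return val_losses[best_index], best_index + 1


# Made-up validation-loss curves for two hypothetical configurations.
histories = {
    "config_a": [0.90, 0.60, 0.45, 0.50, 0.58],
    "config_b": [0.95, 0.70, 0.55, 0.42, 0.47],
}

# Rank configurations by their best validation loss, not their final one.
ranked = sorted(histories, key=lambda name: best_epoch_score(histories[name])[0])
for name in ranked:
    loss, epoch = best_epoch_score(histories[name])
    print(f"{name}: best val_loss={loss:.2f} at epoch {epoch}")
```

Keras also ships an `EarlyStopping` callback that halts training once a monitored metric stops improving, which may serve a similar purpose when running the full number of epochs is not needed.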
@jacobzweig Great example code. Question though: it seems that you are optimizing on the AUC of your validation data:
pred_auc =model.predict_proba(X_val, batch_size = 128, verbose = 0)
acc = roc_auc_score(y_val, pred_auc)
Does this not mean that in fact you are training the hyper parameters to learn the correct answer, rather than to predict it? It seems to me that the validation set has now become part of your training data through optimization of the hyper parameters?
Hey @jdelange - you're correct - this is assuming that you have a separate unseen test set. It would be incorrect to report your validation set accuracy after any form of hyperparameter optimization - even slight manual tweaking.
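As a rough illustration of that setup, the data can be split three ways: a training set for fitting, a validation set for the hyperparameter search, and a test set that is touched only once at the end. The function name, fractions, and seed below are arbitrary choices for this sketch.

```python
import random


def three_way_split(n_samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into train/val/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_fraction)
    n_val = int(n_samples * val_fraction)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = three_way_split(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

The search would then use only the training and validation indices, and the final model would be scored once on the test indices to get an unbiased estimate.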
@maxpumperla Hi Max, is it possible to use hyperas with a model that is trained with data-parallelism across multiple GPUs? (i.e. I send separate batches to different GPUs, train the same model, and concatenate the outputs)
hi @dylanrandle, do you want to move this to hyperas? Short answer: it depends what precisely you are doing. As hyperas is just a wrapper for hyperopt, which has a distributed mode using mongodb, this use case is generally covered. In fact, I would recommend using plain hyperopt for this.
@maxpumperla Is it possible to make hyperas compatible with keras 2.x ?
Hi @jacobzweig
best = fmin(f_nn, space, algo=tpe.suggest, max_evals=50, trials=trials)
print 'best: '
In the code above, you are minimizing the f_nn function, which returns a roc_auc_score. I was just wondering whether we should be increasing or decreasing the roc_auc_score.
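A note on the sign convention, since it is easy to miss: f_nn in the example returns 'loss': -acc, so when fmin minimizes the loss it is in effect maximizing the AUC. The toy sketch below shows the same trick without hyperopt; the `objective` function and its candidate values are invented for illustration.

```python
def objective(threshold):
    """Toy stand-in for f_nn: the score peaks when threshold is 0.3."""
    score = 1.0 - abs(threshold - 0.3)  # higher is better, like AUC
    # Negate the score so that a minimizer ends up maximizing it.
    return {"loss": -score, "score": score}


candidates = [0.1, 0.2, 0.3, 0.4, 0.5]
results = [(t, objective(t)) for t in candidates]

# Picking the smallest loss selects the candidate with the largest score.
best_threshold, best_result = min(results, key=lambda item: item[1]["loss"])
print("best threshold:", best_threshold, "score:", best_result["score"])
```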
@hanlianlu Hi, is there any method to optimize the optimizer and its params at the same time? Say, 'optimizer' :
'SGD: lr=[0.01-0.08], decay=1e-7, momentum=[0.01-0.15], nesterov=True)'
,'RMSprop': ....
'Adadelta': .....
'Adam'] ....
Hello,
If you are talking about hyperparameters within the optimizer, it can be done in the same way as with other params. Back then I was using "hyperas" with an early version of Keras.
But today Keras should already have hyperparameter optimization implemented; you could search the docs a bit.
Best,
Hanlian Lyu
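To make "the same way as with other params" a bit more concrete, here is a library-free sketch of a conditional search space, where each optimizer carries its own parameter ranges and only the chosen optimizer's parameters are sampled. The SGD ranges come from the question above; the RMSprop and Adam ranges and all names are assumptions made for this example. In hyperopt, the analogous construct would be an hp.choice over dicts, as the first example in this thread does for 'units3' and 'dropout3'.

```python
import random

# Per-optimizer parameter ranges as (low, high) tuples.
# SGD ranges follow the question; the others are hypothetical.
OPTIMIZER_SPACE = {
    "sgd": {"lr": (0.01, 0.08), "momentum": (0.01, 0.15)},
    "rmsprop": {"lr": (0.0005, 0.005)},
    "adam": {"lr": (0.0005, 0.005)},
}


def sample_optimizer(rng):
    """Pick an optimizer, then sample only the parameters that belong to it."""
    name = rng.choice(sorted(OPTIMIZER_SPACE))
    params = {
        key: rng.uniform(low, high)
        for key, (low, high) in OPTIMIZER_SPACE[name].items()
    }
    return {"name": name, **params}


rng = random.Random(0)
samples = [sample_optimizer(rng) for _ in range(20)]
for sample in samples[:3]:
    print(sample)
```

Each sampled dict could then be turned into an actual optimizer object inside the objective function before compiling the model.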
For hyperparameter optimization with Keras, also try Talos: Hyperparameter Optimization for Keras
disclaimer: I'm the core developer of the package.
Closing as this is resolved, feel free to reopen if problem persists.