TPOT: Run Arbitrary Code Every Generation

Created on 13 Feb 2018 · 1 Comment · Source: EpistasisLab/tpot

I couldn't find this in the API or documentation, so please excuse me if this is trivial. After training a model with TPOT, I score the fitted pipeline on the test set using mean absolute error and mean squared error, as in the following example.

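from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

# x, y, test_size, seed, and scaler_x are assumed to be defined earlier (not shown)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=test_size, random_state=seed
)

tpot = TPOTRegressor(generations=50, population_size=20, verbosity=2)
tpot.fit(x_train, y_train)

# score at the end of training
y_predicted = tpot.predict(scaler_x.transform(x_test))
print('mae: ', mean_absolute_error(y_test, y_predicted))
print('mse: ', mean_squared_error(y_test, y_predicted))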

I would like to compute these scores after every generation, ideally through a callback passed to either TPOTRegressor or tpot.fit. It might look like the following, where each_generation is a hypothetical parameter.

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=test_size, random_state=seed
)

def score_tpot(_tpot):
    y_predicted = _tpot.predict(scaler_x.transform(x_test))
    print('mae: ', mean_absolute_error(y_test, y_predicted))
    print('mse: ', mean_squared_error(y_test, y_predicted))

tpot = TPOTRegressor(
    generations=50, population_size=20, verbosity=2, each_generation=score_tpot
)
tpot.fit(x_train, y_train)

Is there currently a way to do something like this that I could not find in the documentation?

Thank you for any time you put into reviewing this question.


Most helpful comment

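We don't have direct support for this, but it is technically feasible with the warm_start parameter: with warm_start=True, each call to fit reuses the population from the previous call instead of starting over, so fitting one generation at a time lets you run your own code between generations. Some quick code: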

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=test_size, random_state=seed
)

# generations=1 with warm_start=True: each fit() call runs one more generation
tpot = TPOTRegressor(generations=1, population_size=20, verbosity=2, warm_start=True)
for _ in range(50):
    tpot.fit(x_train, y_train)
    # score after this generation
    y_predicted = tpot.predict(scaler_x.transform(x_test))
    print('mae: ', mean_absolute_error(y_test, y_predicted))
    print('mse: ', mean_squared_error(y_test, y_predicted))

# score at the end of training
y_predicted = tpot.predict(scaler_x.transform(x_test))
print('mae: ', mean_absolute_error(y_test, y_predicted))
print('mse: ', mean_squared_error(y_test, y_predicted))
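
For anyone who wants something closer to the callback style asked for in the question, the loop above could be wrapped in a small helper. This is only a sketch, not part of TPOT: fit_with_callback and its each_generation argument are hypothetical names, and it assumes the estimator was created with generations=1 and warm_start=True so that every fit() call continues from the previous one. The helper itself only relies on the estimator having a fit method.

```python
from typing import Any, Callable, List


def fit_with_callback(
    estimator: Any,
    x_train: Any,
    y_train: Any,
    rounds: int,
    each_generation: Callable[[Any, int], Any],
) -> List[Any]:
    """Call estimator.fit() `rounds` times, running a callback after each call.

    Hypothetical helper, not a TPOT API. It assumes the estimator was built
    with generations=1 and warm_start=True, so each fit() adds one generation.
    Returns the list of values returned by the callback, one per round.
    """
    results = []
    for generation in range(1, rounds + 1):
        estimator.fit(x_train, y_train)
        # Hand the fitted estimator and the 1-based round number to the caller.
        results.append(each_generation(estimator, generation))
    return results


# Possible usage with the names from this thread (not executed here):
#
#   tpot = TPOTRegressor(generations=1, population_size=20,
#                        verbosity=2, warm_start=True)
#   history = fit_with_callback(
#       tpot, x_train, y_train, 50,
#       lambda est, gen: mean_absolute_error(
#           y_test, est.predict(scaler_x.transform(x_test))),
#   )
```

Returning the callback results as a list, instead of only printing them, makes it easy to plot or compare the per-generation scores afterwards.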
