Tpot: What is the formula used for r2 scoring in tpot?

Created on 24 Nov 2017 · 8 Comments · Source: EpistasisLab/tpot

I am using tpot for regression. My code is:

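from tpot import TPOTRegressor

# trainX/trainY and testX/testY are the train/test splits (not shown here)
tpot = TPOTRegressor(scoring="r2", generations=2, population_size=50, verbosity=2, n_jobs=-1)
tpot.fit(trainX, trainY)
print("Score is", tpot.score(testX, testY))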

However I get:

Score is 122.641597476

The maximum should be 1.0 according to http://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html .

Expected result

A score that is at most 1.0

Current result

Best pipeline: LassoLarsCV(ExtraTreesRegressor(input_matrix, bootstrap=True, max_features=0.65, min_samples_leaf=9, min_samples_split=16, n_estimators=100), normalize=False)
Score is 122.641597476

lesshaste

All 8 comments

Check whether you are using negative log loss. You probably are, in which case you are doing a maximization instead of a minimization.

rspadim on 24 Nov 2017

@rspadim Sorry, I don't fully understand. The full code is at https://bpaste.net/show/cf0b0f75657f . I think you always want to maximize the r2 score, but the maximum possible value should be 1.0. I set the scoring function in this line:

tpot = TPOTRegressor(scoring="r2", generations=2, population_size=50, verbosity=2, n_jobs=-1)

The mathematical formula for r2 is at http://scikit-learn.org/stable/modules/model_evaluation.html#r2-score .

lesshaste on 24 Nov 2017
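
For reference, the definition given on the scikit-learn model evaluation page linked above can be written as follows. This is a restatement of the documented formula, not something taken from TPOT's source; here y_i are the true values, \hat{y}_i the predictions, and \bar{y} the mean of the true values.

```latex
R^2(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```

Because the ratio can be arbitrarily large when the predictions are far from the targets, the score has an upper bound of 1.0 but no lower bound.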

I think the issue is that the r2 score in scikit-learn can be negative, and even less than -1, for very bad predictions. TPOT internally should maximize the r2 score, but the value printed to stdout is the absolute value of the score (I think this was mentioned in an earlier issue).

weixuanfu on 24 Nov 2017
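
A minimal sketch of the effect described above, assuming the standard R² definition (1 minus the residual sum of squares over the total sum of squares). The `r2` helper and the toy data are hypothetical and written in plain Python for illustration; this is not TPOT's or scikit-learn's code.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data, made up for this example
y_true = [1.0, 2.0, 3.0]
good_pred = [1.0, 2.0, 3.0]     # perfect predictions
bad_pred = [10.0, -10.0, 10.0]  # predictions far from the targets

perfect = r2(y_true, good_pred)
terrible = r2(y_true, bad_pred)

print(perfect)        # 1.0
print(terrible)       # -136.0
print(abs(terrible))  # 136.0, which looks like an impossible r2 score
```

Taking the absolute value hides the sign, so a very poor fit shows up as a large positive number like the one reported in this issue.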

@weixuanfu So is 122.641597476 the absolute value of the r2 score? If so, does it make sense to output this? It doesn't seem informative. I mean, -1 and 1 are very different r2 scores afaict.

lesshaste on 24 Nov 2017

Maybe we need to reopen #425 to refine the stdout, @rhiever. We have already seen a few related questions, like #612.

weixuanfu on 24 Nov 2017

Yes, after we've experimented more with regression now, I think we need to remove the abs from the score function and simply allow the user to deal with the negative values as they need to.

rhiever on 29 Nov 2017

👍2

OK PR #634 was posted.

weixuanfu on 29 Nov 2017

I am closing this issue since the PR has been merged into the dev branch. Please feel free to re-open the issue (or comment further) if you have any more questions.

weixuanfu on 29 Nov 2017
