I wanted to test a very simple example.
Here is the code:
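from pandas import read_excel
from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

# Load the data and separate the feature columns from the target column
df = read_excel('Test1.xlsx')
features = df.drop('target', axis=1).values
target = df['target'].values
X_train, X_test, y_train, y_test = train_test_split(features, target, train_size=0.8, test_size=0.2)
# Run the pipeline search, score the best pipeline on the held-out data, then export it
tpot = TPOTRegressor(generations=50, population_size=50, n_jobs=1, verbosity=2)
tpot.fit(X_train, y_train)
print("Cross Validation(CV) score : {} / 0<= CV score <= 1(perfectly accurate) ".format(tpot.score(X_test, y_test)))
tpot.export('tpot_test1_pipeline.py')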
Optimization Progress: 4% | 100/2550 [00:41<29:16, 1.39pipeline/s]
Generation 1 - Current best internal CV score: 3.1554436208840474e-31
Optimization Progress: 6% | 150/2550 [01:10<29:32, 1.35pipeline/s]
Generation 2 - Current best internal CV score: 0.0
Optimization Progress: 8% | 200/2550 [01:37<19:39, 1.99pipeline/s]
Generation 3 - Current best internal CV score: 0.0
Optimization Progress: 10% | 250/2550 [02:04<16:20, 2.35pipeline/s]
Generation 4 - Current best internal CV score: 0.0
Optimization Progress: 12% | 300/2550 [02:31<09:22, 4.00pipeline/s]
Generation 5 - Current best internal CV score: 0.0
Optimization Progress: 14% | 350/2550 [03:00<12:53, 2.85pipeline/s]
Generation 6 - Current best internal CV score: 0.0
Optimization Progress: 16% | 400/2550 [03:25<17:03, 2.10pipeline/s]
Generation 7 - Current best internal CV score: 0.0
Optimization Progress: 18% | 449/2550 [03:51<16:06, 2.17pipeline/s]
Generation 8 - Current best internal CV score: 0.0
Optimization Progress: 20% | 500/2550 [04:28<12:52, 2.65pipeline/s]
Generation 9 - Current best internal CV score: 0.0
Optimization Progress: 22% | 550/2550 [04:56<17:51, 1.87pipeline/s]
Generation 10 - Current best internal CV score: 0.0
I believe I ran the regressor correctly. However, the CV score looks wrong.
I expected a score of almost 100%, because it's a really simple example.
Please let me know what you think the problem is.
P.S. In the status window, these warning messages showed up.
C:\Users\Dane\Anaconda3\lib\site-packages\deap\tools\_hypervolume\pyhv.py:33: ImportWarning: Falling back to the python version of hypervolume module. Expect this to be very slow.
  "module. Expect this to be very slow.", ImportWarning)
C:\Users\Dane\Anaconda3\lib\importlib\_bootstrap.py:205: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
  """
Otherwise, everything seemed okay.
After the optimization finished, the exported file was created too. It recommended this pipeline: "exported_pipeline = LassoLarsCV(normalize=False)"
Anyway, the real problem is still "0 CV score" :)
It should be 0 for TPOTRegressor. The default scoring function in TPOTRegressor is neg_mean_squared_error, so a score of 0 means an MSE of 0, which is the best possible value. Please check the TPOT API for Regression for more details.
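To make the sign convention concrete, here is a minimal sketch in plain Python. The two helper functions are hypothetical stand-ins written for illustration, not TPOT or scikit-learn code: one mirrors the "negated MSE" convention, where 0 is the best score and worse fits are more negative, and the other computes R², which is the kind of 0-to-1 "accuracy-like" number the question was expecting. The sample values are made up.

```python
def neg_mean_squared_error(y_true, y_pred):
    """Hypothetical helper: mean squared error multiplied by -1,
    so that a higher score always means a better fit."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return -mse


def r2_score(y_true, y_pred):
    """Hypothetical helper: coefficient of determination, 1.0 for a perfect fit."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up targets and two sets of predictions
y_true = [1.0, 2.0, 3.0, 4.0]
perfect_pred = [1.0, 2.0, 3.0, 4.0]   # every prediction is exact
shifted_pred = [2.0, 3.0, 4.0, 5.0]   # every prediction is off by 1

perfect_neg_mse = neg_mean_squared_error(y_true, perfect_pred)  # equal to 0: best possible
shifted_neg_mse = neg_mean_squared_error(y_true, shifted_pred)  # -1.0: worse fit, lower score

perfect_r2 = r2_score(y_true, perfect_pred)  # 1.0
shifted_r2 = r2_score(y_true, shifted_pred)  # roughly 0.2

print(perfect_neg_mse == 0, shifted_neg_mse)
print(perfect_r2, round(shifted_r2, 3))
```

If a score where 1 means a perfect fit is preferred, TPOTRegressor takes a `scoring` argument, and passing `scoring='r2'` should report R² instead of the negated MSE; check the TPOT API documentation for the scorers your installed version supports.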
oh, thanks a lot! :) I've been running a lot of examples so far haha and every example's CV score converged to 0..! It's such good news! Once again, thanks a lot and have a great day :)!!