Tpot: Documentation - TPOT vs sklearn - coverage

Created on 7 Aug 2016 · 6 Comments · Source: EpistasisLab/tpot

It would be nice to have a table comparing TPOT operators with sklearn operators.
AFAIK not all sklearn operators are included in TPOT. Such a table could be used as:

  • a roadmap for TPOT
  • a hint that some dimensionality reduction techniques (e.g. t-SNE) can be applied without TPOT
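
As a rough illustration of the second point, t-SNE can be run directly through scikit-learn with no TPOT involvement. This is only a sketch: the random data, the sample count, and the `perplexity` value are assumptions chosen so the example runs quickly, not recommendations.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in data: 60 samples with 10 features each (assumption)
rng = np.random.RandomState(0)
X = rng.rand(60, 10)

# perplexity must be smaller than the number of samples
tsne = TSNE(n_components=2, perplexity=10, random_state=0)
embedding = tsne.fit_transform(X)

# One 2-D point per input sample
print(embedding.shape)
```

The resulting 2-D embedding could then be plotted or passed to a downstream step outside of any TPOT pipeline.
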

Labels: enhancement, need contributor

All 6 comments

Good idea. I've added this to the enhancements list.

This is our current coverage:

  • [ ] LabelEncoder
  • [ ] NearestNeighbors
  • [ ] OneVsOneClassifier
  • [ ] OneVsRestClassifier
  • [ ] AdditiveChi2Sampler
  • [X] LogisticRegression
  • [ ] PriorProbabilityEstimator
  • [ ] NuSVC
  • [ ] LinearSVR
  • [ ] NeighborsBase
  • [ ] LabelSpreading
  • [ ] LinearModel
  • [ ] AffinityPropagation
  • [ ] FunctionTransformer
  • [ ] ElasticNetCV
  • [ ] SpectralBiclustering
  • [ ] VBGMM
  • [ ] RandomizedLasso
  • [ ] LassoCV
  • [ ] PatchExtractor
  • [ ] GridSearchCV
  • [ ] PLSSVD
  • [ ] MiniBatchSparsePCA
  • [ ] ZeroEstimator
  • [ ] LSHForest
  • [ ] MultiLabelBinarizer
  • [ ] FeatureHasher
  • [X] FastICA
  • [ ] Ridge
  • [ ] LedoitWolf
  • [ ] MiniBatchDictionaryLearning
  • [ ] ElasticNet
  • [ ] Imputer
  • [ ] KernelRidge
  • [x] RandomizedPCA
  • [X] SelectKBest
  • [ ] MultiTaskElasticNet
  • [ ] LassoLarsIC
  • [ ] GraphLassoCV
  • [ ] LabelBinarizer
  • [X] MaxAbsScaler
  • [ ] VotingClassifier
  • [ ] IsotonicRegression
  • [ ] FactorAnalysis
  • [ ] LabelPropagation
  • [X] GradientBoostingClassifier
  • [ ] LogOddsEstimator
  • [ ] SparsePCA
  • [ ] TfidfVectorizer
  • [X] PassiveAggressiveClassifier
  • [ ] RFECV
  • [ ] RandomizedLogisticRegression
  • [ ] AgglomerativeClustering
  • [ ] ScaledLogOddsEstimator
  • [ ] OrthogonalMatchingPursuitCV
  • [ ] SVR
  • [ ] KernelCenterer
  • [X] AdaBoostClassifier
  • [ ] NearestCentroid
  • [ ] TheilSenRegressor
  • [ ] GaussianProcess
  • [ ] SVC
  • [ ] PLSCanonical
  • [ ] SelectFromModel
  • [ ] CountVectorizer
  • [X] StandardScaler
  • [X] GaussianNB
  • [ ] FeatureUnion
  • [ ] DummyClassifier
  • [ ] GaussianRandomProjection
  • [ ] KNeighborsRegressor
  • [ ] MultiTaskElasticNetCV
  • [ ] ForestClassifier
  • [X] ExtraTreesClassifier
  • [ ] SparseRandomProjection
  • [ ] GraphLasso
  • [X] FeatureAgglomeration
  • [ ] EllipticEnvelope
  • [ ] OneHotEncoder
  • [ ] NuSVR
  • [ ] RadiusNeighborsClassifier
  • [ ] RidgeCV
  • [ ] MeanShift
  • [ ] LatentDirichletAllocation
  • [X] Binarizer
  • [ ] KMeans
  • [ ] LarsCV
  • [ ] SGDClassifier
  • [ ] BaggingClassifier
  • [ ] AdaBoostRegressor
  • [ ] RidgeClassifierCV
  • [ ] TruncatedSVD
  • [ ] Isomap
  • [ ] PCA
  • [ ] SkewedChi2Sampler
  • [ ] ExtraTreesRegressor
  • [ ] RandomizedSearchCV
  • [ ] Lasso
  • [ ] IncrementalPCA
  • [ ] LassoLars
  • [X] PolynomialFeatures
  • [ ] KernelDensity
  • [ ] PassiveAggressiveRegressor
  • [ ] SpectralCoclustering
  • [ ] QuantileEstimator
  • [ ] SpectralClustering
  • [X] MultinomialNB
  • [ ] LinearRegression
  • [ ] Pipeline
  • [X] SelectFwe
  • [ ] GMM
  • [ ] MeanEstimator
  • [X] Nystroem
  • [X] RandomForestClassifier
  • [ ] DummyRegressor
  • [ ] Perceptron
  • [ ] CCA
  • [X] DecisionTreeClassifier
  • [ ] ExtraTreeRegressor
  • [X] RBFSampler
  • [ ] BayesianRidge
  • [ ] RANSACRegressor
  • [ ] DictionaryLearning
  • [ ] TSNE
  • [ ] HashingVectorizer
  • [ ] RidgeClassifier
  • [ ] ForestRegressor
  • [ ] DecisionTreeRegressor
  • [ ] DBSCAN
  • [ ] OutputCodeClassifier
  • [X] KNeighborsClassifier
  • [X] ZeroCount
  • [ ] GradientBoostingRegressor
  • [ ] ARDRegression
  • [ ] LinearDiscriminantAnalysis
  • [ ] CalibratedClassifierCV
  • [X] SelectPercentile
  • [ ] OAS
  • [ ] DPGMM
  • [ ] SelectFpr
  • [ ] DictVectorizer
  • [X] BernoulliNB
  • [ ] Normalizer
  • [X] LinearSVC
  • [X] VarianceThreshold
  • [ ] MiniBatchKMeans
  • [ ] RandomForestRegressor
  • [ ] MultiTaskLasso
  • [ ] ProjectedGradientNMF
  • [ ] RandomTreesEmbedding
  • [ ] OneClassSVM
  • [ ] SGDRegressor
  • [ ] GaussianRandomProjectionHash
  • [ ] SparseCoder
  • [ ] MultiTaskLassoCV
  • [ ] QuadraticDiscriminantAnalysis
  • [ ] Birch
  • [ ] RadiusNeighborsRegressor
  • [ ] ExtraTreeClassifier
  • [ ] MinCovDet
  • [ ] LinearModelCV
  • [X] MinMaxScaler
  • [ ] NMF
  • [ ] TfidfTransformer
  • [ ] SelectFdr
  • [ ] KernelPCA
  • [ ] BaggingRegressor
  • [ ] Lars
  • [ ] SpectralEmbedding
  • [X] RobustScaler
  • [ ] BernoulliRBM
  • [ ] ShrunkCovariance
  • [ ] OrthogonalMatchingPursuit
  • [ ] NewBase
  • [ ] GenericUnivariateSelect
  • [X] RFE
  • [ ] LassoLarsCV
  • [ ] LocallyLinearEmbedding
  • [ ] PLSRegression
  • [ ] EmpiricalCovariance
  • [ ] LogisticRegressionCV
  • [ ] MDS

@teaearlgraycold for someone coming brand new into the project, is there a good example of what needs to be done for the list you just posted?

@westonplatter - here's the code I used

from sklearn import * # Needed to discover all subclasses
import tpot
import sklearn

def all_subclasses(cls):
    # Recursively collect the direct and indirect subclasses of cls
    return cls.__subclasses__() + [g for s in cls.__subclasses__() for g in all_subclasses(s)]

# Names of the operators TPOT currently wraps
tpot_estimators = set([x.__name__ for x in tpot.operators.Operator.inheritors()])
# Names of sklearn estimators, skipping private classes and "Base*" classes
sklearn_estimators = set([x.__name__ for x in all_subclasses(sklearn.base.BaseEstimator)
    if x.__name__[0] != '_' and not x.__name__.startswith("Base")])

# Print one checklist entry per sklearn estimator, ticked if TPOT has it
for est in sklearn_estimators:
    marker = 'X' if est in tpot_estimators else ' '
    print("- [{}] {}".format(marker, est))
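
The checklist this script prints could also be turned into the kind of table the issue asks for. The following is a minimal sketch, assuming the output lines have the `- [X] Name` shape shown above; `parse_checklist` and the four sample lines are hypothetical and only illustrate the idea.

```python
def parse_checklist(lines):
    """Map each operator name to True (covered) or False (not covered)."""
    coverage = {}
    for line in lines:
        # Drop indentation and the leading list marker ("-" or "•")
        item = line.strip().lstrip("-•").strip()
        if not item.startswith("["):
            continue  # not a checklist entry
        marker, _, name = item.partition("]")
        coverage[name.strip()] = marker[1:].strip().lower() == "x"
    return coverage

# Hypothetical sample taken from the checklist format above
sample = [
    "- [X] LogisticRegression",
    "- [ ] NuSVC",
    "- [x] RandomizedPCA",
    "- [ ] TSNE",
]

coverage = parse_checklist(sample)
covered = sorted(name for name, done in coverage.items() if done)
missing = sorted(name for name, done in coverage.items() if not done)

# Two-column table: operator name and whether TPOT has it
print("{:<20} {}".format("sklearn operator", "in TPOT"))
for name in sorted(coverage):
    print("{:<20} {}".format(name, "yes" if coverage[name] else "no"))
print("{} of {} covered".format(len(covered), len(coverage)))
```

Accepting both `X` and `x` as ticks matters here because the posted list mixes the two.
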

@teaearlgraycold thanks for the example. I'll take a look.

@westonplatter, that code requires you to use the version of TPOT that is currently in the development branch (0.5).
