Get configspace

This file is part of the TPOT library.

The current version of TPOT was developed at Cedars-Sinai by:
- Pedro Henrique Ribeiro (https://github.com/perib, https://www.linkedin.com/in/pedro-ribeiro/)
- Anil Saini (anil.saini@cshs.org)
- Jose Hernandez (jgh9094@gmail.com)
- Jay Moran (jay.moran@cshs.org)
- Nicholas Matsumoto (nicholas.matsumoto@cshs.org)
- Hyunjun Choi (hyunjun.choi@cshs.org)
- Gabriel Ketron (gabriel.ketron@cshs.org)
- Miguel E. Hernandez (miguel.e.hernandez@cshs.org)
- Jason Moore (moorejh28@gmail.com)

The original version of TPOT was primarily developed at the University of Pennsylvania by:
- Randal S. Olson (rso@randalolson.com)
- Weixuan Fu (weixuanf@upenn.edu)
- Daniel Angell (dpa34@drexel.edu)
- Jason Moore (moorejh28@gmail.com)
- and many more generous open-source contributors

TPOT is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

TPOT is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with TPOT. If not, see http://www.gnu.org/licenses/.

get_configspace(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, n_jobs=1)

Returns the ConfigSpace.ConfigurationSpace holding the hyperparameter ranges for the given scikit-learn method. The n_classes, n_samples, n_features, random_state, and n_jobs arguments set the hyperparameters that depend on these values. Methods with no tunable hyperparameters return an empty dict, and an unrecognized name raises a ValueError.

Parameters:

Name Type Description Default
name str

The str name of the scikit-learn method for which to create the ConfigurationSpace. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier)

required
n_classes int

The number of classes in the target variable. Default is 3.

3
n_samples int

The number of samples in the dataset. Default is 1000.

1000
n_features int

The number of features in the dataset. Default is 100.

100
random_state int

The random_state to use in the ConfigurationSpace. Default is None. If None, the random_state hyperparameter is not included in the ConfigurationSpace. Use this to set the random state for the individual methods if you want to ensure reproducibility.

None
n_jobs int(default=1)

Sets the n_jobs parameter for estimators that have it. Default is 1.

1
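The name-based dispatch described above can be sketched with the standard library alone. This is not TPOT's implementation: `sketch_configspace`, the plain-dict spaces, and the specific bounds are assumptions chosen only to show how dataset shape and `random_state` can feed into the returned ranges.

```python
def _knn_space(n_samples, n_features, random_state):
    # Assumed rule: the neighbor count is capped by the number of samples.
    return {"n_neighbors": (1, max(1, min(100, n_samples - 1)))}


def _fastica_space(n_samples, n_features, random_state):
    # Assumed rule: the component count is capped by the number of features.
    space = {"n_components": (1, max(1, n_features))}
    # Mirror the documented behavior: random_state is only included when set.
    if random_state is not None:
        space["random_state"] = random_state
    return space


def _empty_space(n_samples, n_features, random_state):
    # Methods with nothing to tune map to an empty space.
    return {}


# Hypothetical registry standing in for the long match statement.
SKETCH_BUILDERS = {
    "KNeighborsClassifier": _knn_space,
    "FastICA": _fastica_space,
    "StandardScaler": _empty_space,
}


def sketch_configspace(name, n_samples=1000, n_features=100, random_state=None):
    """Toy stand-in for get_configspace; returns dicts of (low, high) ranges."""
    if name not in SKETCH_BUILDERS:
        raise ValueError(f"Could not find configspace for {name}")
    return SKETCH_BUILDERS[name](n_samples, n_features, random_state)


small = sketch_configspace("KNeighborsClassifier", n_samples=10)
seeded = sketch_configspace("FastICA", n_features=5, random_state=42)
```

With only 10 samples the neighbor range shrinks to fit the data, and the seeded FastICA space carries the fixed `random_state` alongside its tunable range.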
Source code in tpot/config/get_configspace.py
def get_configspace(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, n_jobs=1):
    """
    Returns the ConfigSpace.ConfigurationSpace holding the hyperparameter ranges for the given
    scikit-learn method. The n_classes, n_samples, n_features, random_state, and n_jobs arguments set the
    hyperparameters that depend on these values. Methods with no tunable hyperparameters return an
    empty dict, and an unrecognized name raises a ValueError.

    Parameters
    ----------
    name : str
        The str name of the scikit-learn method for which to create the ConfigurationSpace. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier)
    n_classes : int
        The number of classes in the target variable. Default is 3.
    n_samples : int
        The number of samples in the dataset. Default is 1000.
    n_features : int
        The number of features in the dataset. Default is 100.
    random_state : int
        The random_state to use in the ConfigurationSpace. Default is None.
        If None, the random_state hyperparameter is not included in the ConfigurationSpace.
        Use this to set the random state for the individual methods if you want to ensure reproducibility.
    n_jobs : int (default=1)
        Sets the n_jobs parameter for estimators that have it. Default is 1.

    """
    match name:

        #autoqtl_builtins.py
        case "FeatureEncodingFrequencySelector":
            return autoqtl_builtins.FeatureEncodingFrequencySelector_ConfigurationSpace
        case "DominantEncoder":
            return {}
        case "RecessiveEncoder":
            return {}
        case "HeterosisEncoder":
            return {}
        case "UnderDominanceEncoder":
            return {}
        case "OverDominanceEncoder":
            return {}

        case "Passthrough":
            return {}
        case "SkipTransformer":
            return {}

        #classifiers.py
        case "LinearDiscriminantAnalysis":
            return classifiers.get_LinearDiscriminantAnalysis_ConfigurationSpace()
        case "AdaBoostClassifier":
            return classifiers.get_AdaBoostClassifier_ConfigurationSpace(random_state=random_state)
        case "LogisticRegression":
            return classifiers.get_LogisticRegression_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "KNeighborsClassifier":
            return classifiers.get_KNeighborsClassifier_ConfigurationSpace(n_samples=n_samples, n_jobs=n_jobs)
        case "DecisionTreeClassifier":
            return classifiers.get_DecisionTreeClassifier_ConfigurationSpace(n_featues=n_features, random_state=random_state)
        case "SVC":
            return classifiers.get_SVC_ConfigurationSpace(random_state=random_state)
        case "LinearSVC":
            return classifiers.get_LinearSVC_ConfigurationSpace(random_state=random_state)
        case "RandomForestClassifier":
            return classifiers.get_RandomForestClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "GradientBoostingClassifier":
            return classifiers.get_GradientBoostingClassifier_ConfigurationSpace(n_classes=n_classes, random_state=random_state)
        case "HistGradientBoostingClassifier":
            return classifiers.get_HistGradientBoostingClassifier_ConfigurationSpace(random_state=random_state)
        case "XGBClassifier":
            return classifiers.get_XGBClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "LGBMClassifier":
            return classifiers.get_LGBMClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "ExtraTreesClassifier":
            return classifiers.get_ExtraTreesClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "SGDClassifier":
            return classifiers.get_SGDClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "MLPClassifier":
            return classifiers.get_MLPClassifier_ConfigurationSpace(random_state=random_state)
        case "BernoulliNB":
            return classifiers.get_BernoulliNB_ConfigurationSpace()
        case "MultinomialNB":
            return classifiers.get_MultinomialNB_ConfigurationSpace()
        case "GaussianNB":
            return {}
        case "LassoLarsCV":
            return {}
        case "ElasticNetCV":
            return regressors.ElasticNetCV_configspace
        case "RidgeCV":
            return {}
        case "PassiveAggressiveClassifier":
            return classifiers.get_PassiveAggressiveClassifier_ConfigurationSpace(random_state=random_state)
        case "QuadraticDiscriminantAnalysis":
            return classifiers.get_QuadraticDiscriminantAnalysis_ConfigurationSpace()
        case "GaussianProcessClassifier":
            return classifiers.get_GaussianProcessClassifier_ConfigurationSpace(n_features=n_features, random_state=random_state)
        case "BaggingClassifier":
            return classifiers.get_BaggingClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)

        #regressors.py
        case "RandomForestRegressor":
            return regressors.get_RandomForestRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "SGDRegressor":
            return regressors.get_SGDRegressor_ConfigurationSpace(random_state=random_state)
        case "Ridge":
            return regressors.get_Ridge_ConfigurationSpace(random_state=random_state)
        case "Lasso":
            return regressors.get_Lasso_ConfigurationSpace(random_state=random_state)
        case "ElasticNet":
            return regressors.get_ElasticNet_ConfigurationSpace(random_state=random_state)
        case "Lars":
            return regressors.get_Lars_ConfigurationSpace(random_state=random_state)
        case "OthogonalMatchingPursuit":
            return regressors.get_OthogonalMatchingPursuit_ConfigurationSpace()
        case "BayesianRidge":
            return regressors.get_BayesianRidge_ConfigurationSpace()
        case "LassoLars":
            return regressors.get_LassoLars_ConfigurationSpace(random_state=random_state)
        case "BaggingRegressor":
            return regressors.get_BaggingRegressor_ConfigurationSpace(random_state=random_state)
        case "ARDRegression":
            return regressors.get_ARDRegression_ConfigurationSpace()
        case "TheilSenRegressor":
            return regressors.get_TheilSenRegressor_ConfigurationSpace(random_state=random_state)
        case "Perceptron":
            return regressors.get_Perceptron_ConfigurationSpace(random_state=random_state)
        case "DecisionTreeRegressor":
            return regressors.get_DecisionTreeRegressor_ConfigurationSpace(random_state=random_state)
        case "LinearSVR":
            return regressors.get_LinearSVR_ConfigurationSpace(random_state=random_state)
        case "SVR":
            return regressors.get_SVR_ConfigurationSpace()
        case "XGBRegressor":
            return regressors.get_XGBRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "AdaBoostRegressor":
            return regressors.get_AdaBoostRegressor_ConfigurationSpace(random_state=random_state)
        case "ExtraTreesRegressor":
            return regressors.get_ExtraTreesRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "GradientBoostingRegressor":
            return regressors.get_GradientBoostingRegressor_ConfigurationSpace(random_state=random_state)
        case "HistGradientBoostingRegressor":
            return regressors.get_HistGradientBoostingRegressor_ConfigurationSpace(random_state=random_state)
        case "MLPRegressor":
            return regressors.get_MLPRegressor_ConfigurationSpace(random_state=random_state)
        case "KNeighborsRegressor":
            return regressors.get_KNeighborsRegressor_ConfigurationSpace(n_samples=n_samples, n_jobs=n_jobs)
        case "GaussianProcessRegressor":
            return regressors.get_GaussianProcessRegressor_ConfigurationSpace(n_features=n_features, random_state=random_state)
        case "LGBMRegressor":
            return regressors.get_LGBMRegressor_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)

        #transformers.py
        case "Binarizer":
            return transformers.Binarizer_configspace
        case "Normalizer":
            return transformers.Normalizer_configspace
        case "PCA":
            return transformers.PCA_configspace
        case "ZeroCount":
            return transformers.ZeroCount_configspace
        case "FastICA":
            return transformers.get_FastICA_configspace(n_features=n_features, random_state=random_state)
        case "FeatureAgglomeration":
            return transformers.get_FeatureAgglomeration_configspace(n_features=n_features)
        case "Nystroem":
            return transformers.get_Nystroem_configspace(n_features=n_features, random_state=random_state)
        case "RBFSampler":
            return transformers.get_RBFSampler_configspace(n_features=n_features, random_state=random_state)
        case "MinMaxScaler":
            return {}
        case "PowerTransformer":
            return {}
        case "QuantileTransformer":
            return transformers.get_QuantileTransformer_configspace(n_samples=n_samples, random_state=random_state)
        case "RobustScaler":
            return transformers.RobustScaler_configspace
        case "MaxAbsScaler":
            return {}
        case "PolynomialFeatures":
            return transformers.PolynomialFeatures_configspace
        case "StandardScaler":
            return {}
        case "PassKBinsDiscretizer":
            return transformers.get_passkbinsdiscretizer_configspace(random_state=random_state)
        case "KBinsDiscretizer":
            return transformers.get_passkbinsdiscretizer_configspace(random_state=random_state)
        case "ColumnOneHotEncoder":
            return {}
        case "ColumnOrdinalEncoder":
            return {}

        #selectors.py
        case "SelectFwe":
            return selectors.SelectFwe_configspace 
        case "SelectPercentile":
            return selectors.SelectPercentile_configspace
        case "VarianceThreshold":
            return selectors.VarianceThreshold_configspace
        case "RFE":
            return selectors.RFE_configspace_part
        case "SelectFromModel":
            return selectors.SelectFromModel_configspace_part


        #special_configs.py
        case "AddTransformer":
            return {}
        case "mul_neg_1_Transformer":
            return {}
        case "MulTransformer":
            return {}
        case "SafeReciprocalTransformer":
            return {}
        case "EQTransformer":
            return {}
        case "NETransformer":
            return {}
        case "GETransformer":
            return {}
        case "GTTransformer":
            return {}
        case "LETransformer":
            return {}
        case "LTTransformer":
            return {}        
        case "MinTransformer":
            return {}
        case "MaxTransformer":
            return {}
        case "ZeroTransformer":
            return {}
        case "OneTransformer":
            return {}
        case "NTransformer":
            return ConfigurationSpace(

                space = {

                    'n': Float("n", bounds=(-1e2, 1e2)),
                }
            ) 

        #imputers.py
        case "SimpleImputer":
            return imputers.simple_imputer_cs
        case "IterativeImputer":
            return imputers.get_IterativeImputer_config_space(n_features=n_features, random_state=random_state)
        case "IterativeImputer_no_estimator":
            return imputers.get_IterativeImputer_config_space_no_estimator(n_features=n_features, random_state=random_state)

        case "KNNImputer":
            return imputers.get_KNNImputer_config_space(n_samples=n_samples)

        #mdr_configs.py
        case "MDR":
            return mdr_configs.MDR_configspace
        case "ContinuousMDR":
            return mdr_configs.MDR_configspace
        case "ReliefF":
            return mdr_configs.get_skrebate_ReliefF_config_space(n_features=n_features)
        case "SURF":
            return mdr_configs.get_skrebate_SURF_config_space(n_features=n_features)
        case "SURFstar":
            return mdr_configs.get_skrebate_SURFstar_config_space(n_features=n_features)
        case "MultiSURF":
            return mdr_configs.get_skrebate_MultiSURF_config_space(n_features=n_features)

        #classifiers_sklearnex.py
        case "RandomForestClassifier_sklearnex":
            return classifiers_sklearnex.get_RandomForestClassifier_ConfigurationSpace(random_state=random_state, n_jobs=n_jobs)
        case "LogisticRegression_sklearnex":
            return classifiers_sklearnex.get_LogisticRegression_ConfigurationSpace(random_state=random_state)
        case "KNeighborsClassifier_sklearnex":
            return classifiers_sklearnex.get_KNeighborsClassifier_ConfigurationSpace(n_samples=n_samples)
        case "SVC_sklearnex":
            return classifiers_sklearnex.get_SVC_ConfigurationSpace(random_state=random_state)
        case "NuSVC_sklearnex":
            return classifiers_sklearnex.get_NuSVC_ConfigurationSpace(random_state=random_state)

        #regressors_sklearnex.py
        case "LinearRegression_sklearnex":
            return {}
        case "Ridge_sklearnex":
            return regressors_sklearnex.get_Ridge_ConfigurationSpace(random_state=random_state)
        case "Lasso_sklearnex":
            return regressors_sklearnex.get_Lasso_ConfigurationSpace(random_state=random_state)
        case "ElasticNet_sklearnex":
            return regressors_sklearnex.get_ElasticNet_ConfigurationSpace(random_state=random_state)
        case "SVR_sklearnex":
            return regressors_sklearnex.get_SVR_ConfigurationSpace(random_state=random_state)
        case "NuSVR_sklearnex":
            return regressors_sklearnex.get_NuSVR_ConfigurationSpace(random_state=random_state)
        case "RandomForestRegressor_sklearnex":
            return regressors_sklearnex.get_RandomForestRegressor_ConfigurationSpace(random_state=random_state)
        case "KNeighborsRegressor_sklearnex":
            return regressors_sklearnex.get_KNeighborsRegressor_ConfigurationSpace(n_samples=n_samples)

    #raise error
    raise ValueError(f"Could not find configspace for {name}")

get_node(name, n_classes=3, n_samples=100, n_features=100, random_state=None, base_node=EstimatorNode, n_jobs=1)

Helper function for get_search_space. Returns a single EstimatorNode for the given scikit-learn method. Also includes special cases for nodes that require custom parsing of the hyperparameters or methods that wrap other methods.

Parameters:

Name Type Description Default
name str

The name of the scikit-learn method for which to create the node (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier). Group names and lists of names are expanded by get_search_space before this function is called.

required
n_classes int(default=3)

The number of classes in the target variable.

3
n_samples int(default=100)

The number of samples in the dataset.

100
n_features int(default=100)

The number of features in the dataset.

100
random_state int(default=None)

A fixed random_state to pass through to all methods that have a random_state hyperparameter.

None
base_node

The SearchSpace to pass the configuration space to. If you want to experiment with custom mutation/crossover operators, you can pass a custom SearchSpace node here.

EstimatorNode
n_jobs int(default=1)

Sets the n_jobs parameter for estimators that have it. Default is 1.

1

Returns:

Type Description
Returns a SearchSpace object that can be optimized by TPOT.
  • TPOT.search_spaces.nodes.EstimatorNode (or base_node).
  • TPOT.search_spaces.pipelines.WrapperPipeline object if the method requires a wrapped estimator.
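The two return shapes can be illustrated with a small stand-in. The sketch below uses hypothetical dataclasses (`SketchNode`, `SketchWrapper`) rather than TPOT's EstimatorNode and WrapperPipeline, and only reproduces the outer/inner pairings that appear in the source listing for this function; the configuration spaces are left empty for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Stand-in for a plain estimator node."""
    method: str
    space: dict = field(default_factory=dict)


@dataclass
class SketchWrapper:
    """Stand-in for a wrapper whose inner estimator is itself a search space."""
    method: str
    space: dict
    estimator_search_space: SketchNode


# Wrapped names map to (outer method, inner estimator).
WRAPPED = {
    "LinearSVC_wrapped": ("CalibratedClassifierCV", "LinearSVC"),
    "RFE_classification": ("RFE", "ExtraTreesClassifier"),
    "RFE_regression": ("RFE", "ExtraTreesRegressor"),
    "SelectFromModel_classification": ("SelectFromModel", "ExtraTreesClassifier"),
    "SelectFromModel_regression": ("SelectFromModel", "ExtraTreesRegressor"),
}


def sketch_get_node(name):
    # Wrapped names build the inner node first, then wrap it.
    if name in WRAPPED:
        outer, inner = WRAPPED[name]
        return SketchWrapper(method=outer, space={},
                             estimator_search_space=sketch_get_node(inner))
    # Everything else is a single node.
    return SketchNode(method=name)


wrapped = sketch_get_node("RFE_classification")
plain = sketch_get_node("Ridge")
```

Here `wrapped` is a wrapper around an ExtraTreesClassifier node, while `plain` is an ordinary node, matching the two cases listed under Returns.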
Source code in tpot/config/get_configspace.py
def get_node(name, n_classes=3, n_samples=100, n_features=100, random_state=None, base_node=EstimatorNode, n_jobs=1):
    """
    Helper function for get_search_space. Returns a single EstimatorNode for the given scikit-learn method. Also includes special cases for nodes that require custom parsing of the hyperparameters or methods that wrap other methods.

    Parameters
    ----------

    name : str
        The name of the scikit-learn method for which to create the node. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier)
        Group names and lists of names are expanded by get_search_space before this function is called.
    n_classes : int (default=3)
        The number of classes in the target variable.
    n_samples : int (default=100)
        The number of samples in the dataset.
    n_features : int (default=100)
        The number of features in the dataset.
    random_state : int (default=None)
        A fixed random_state to pass through to all methods that have a random_state hyperparameter.
    base_node: TPOT.search_spaces.base.SearchSpace (default=TPOT.search_spaces.nodes.EstimatorNode)
        The SearchSpace to pass the configuration space to. If you want to experiment with custom mutation/crossover operators, you can pass a custom SearchSpace node here.
    n_jobs : int (default=1)
        Sets the n_jobs parameter for estimators that have it. Default is 1.

    Returns
    -------
        Returns a SearchSpace object that can be optimized by TPOT.
        - TPOT.search_spaces.nodes.EstimatorNode (or base_node).
        - TPOT.search_spaces.pipelines.WrapperPipeline object if the method requires a wrapped estimator.


    """

    if name == "LinearSVC_wrapped":
        ext = get_node("LinearSVC", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=sklearn.calibration.CalibratedClassifierCV, space={})
    if name == "RFE_classification":
        rfe_sp = get_configspace(name="RFE", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesClassifier", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=RFE, space=rfe_sp)
    if name == "RFE_regression":
        rfe_sp = get_configspace(name="RFE", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=RFE, space=rfe_sp)
    if name == "SelectFromModel_classification":
        sfm_sp = get_configspace(name="SelectFromModel", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesClassifier", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=SelectFromModel, space=sfm_sp)
    if name == "SelectFromModel_regression":
        sfm_sp = get_configspace(name="SelectFromModel", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        ext = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=ext, method=SelectFromModel, space=sfm_sp)
    # TODO Add IterativeImputer with more estimator methods
    if name == "IterativeImputer_learned_estimators":
        iterative_sp = get_configspace(name="IterativeImputer_no_estimator", n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        regressor_searchspace = get_node("ExtraTreesRegressor", n_classes=n_classes, n_samples=n_samples, random_state=random_state, n_jobs=n_jobs)
        return WrapperPipeline(estimator_search_space=regressor_searchspace, method=IterativeImputer, space=iterative_sp)

    #these are nodes that have special search spaces which require custom parsing of the hyperparameters
    # n_features is forwarded explicitly so that feature-dependent ranges do not silently fall back to the default of 100
    if name == "IterativeImputer":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return EstimatorNode(STRING_TO_CLASS[name], configspace, hyperparameter_parser=imputers.IterativeImputer_hyperparameter_parser)
    if name == "RobustScaler":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=transformers.robust_scaler_hyperparameter_parser)
    if name == "GradientBoostingClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.GradientBoostingClassifier_hyperparameter_parser)
    if name == "HistGradientBoostingClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.HistGradientBoostingClassifier_hyperparameter_parser)
    if name == "GradientBoostingRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.GradientBoostingRegressor_hyperparameter_parser)
    if name == "HistGradientBoostingRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.HistGradientBoostingRegressor_hyperparameter_parser)
    if name == "MLPClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.MLPClassifier_hyperparameter_parser)
    if name == "MLPRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.MLPRegressor_hyperparameter_parser)
    if name == "GaussianProcessRegressor":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=regressors.GaussianProcessRegressor_hyperparameter_parser)
    if name == "GaussianProcessClassifier":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=classifiers.GaussianProcessClassifier_hyperparameter_parser)
    if name == "FeatureAgglomeration":
        configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
        return base_node(STRING_TO_CLASS[name], configspace, hyperparameter_parser=transformers.FeatureAgglomeration_hyperparameter_parser)

    configspace = get_configspace(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, n_jobs=n_jobs)
    if configspace is None:
        #raise warning
        warnings.warn(f"Could not find configspace for {name}")
        return None

    return base_node(STRING_TO_CLASS[name], configspace)
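Several branches above pass a hyperparameter_parser, but the listing does not show what such a parser does. One plausible reading, sketched here under that assumption, is that the parser takes the flat dict of sampled values and returns the keyword arguments for the estimator's constructor. The key names `q_min` and `q_max` and both function names are hypothetical; only `quantile_range` is a real RobustScaler parameter.

```python
def sketch_quantile_parser(params):
    """Combine two flat sampled values into the tuple RobustScaler expects."""
    # "q_min" and "q_max" are hypothetical keys used only in this sketch.
    return {"quantile_range": (params["q_min"], params["q_max"])}


def sketch_build_kwargs(sampled, hyperparameter_parser=None):
    """Turn sampled hyperparameters into constructor keyword arguments."""
    # Without a parser, the sampled values are passed through unchanged.
    if hyperparameter_parser is None:
        return dict(sampled)
    return hyperparameter_parser(sampled)


parsed = sketch_build_kwargs({"q_min": 10.0, "q_max": 90.0}, sketch_quantile_parser)
passthrough = sketch_build_kwargs({"alpha": 0.5})
```

A parser is useful whenever the search space is easier to express as independent scalar ranges than in the nested or tuple form the estimator's constructor requires.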

get_search_space(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, return_choice_pipeline=True, base_node=EstimatorNode, n_jobs=1)

Returns a TPOT search space for a given scikit-learn method or group of methods.

Parameters:

Name Type Description Default
name str or list

The name of the scikit-learn method or group of methods for which to create the search space. - str: The name of the scikit-learn method. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier) Alternatively, the name of a group of methods. (e.g. 'classifiers' for all classifiers). - list: A list of scikit-learn method names. (e.g. ['RandomForestClassifier', 'ExtraTreesClassifier'])

required
n_classes int(default=3)

The number of classes in the target variable.

3
n_samples int(default=1000)

The number of samples in the dataset.

1000
n_features int(default=100)

The number of features in the dataset.

100
random_state int(default=None)

A fixed random_state to pass through to all methods that have a random_state hyperparameter.

None
return_choice_pipeline bool(default=True)

If False, returns a list of TPOT.search_spaces.nodes.EstimatorNode objects. If True, returns a single TPOT.search_spaces.pipelines.ChoicePipeline that includes and samples from all EstimatorNodes.

True
base_node

The SearchSpace to pass the configuration space to. If you want to experiment with custom mutation/crossover operators, you can pass a custom SearchSpace node here.

EstimatorNode
n_jobs int(default=1)

Sets the n_jobs parameter for estimators that have it. Default is 1.

1

Returns:

Type Description
Returns a SearchSpace object that can be optimized by TPOT.
  • TPOT.search_spaces.nodes.EstimatorNode (or base_node) if name is a single method name.
  • NumPy array of TPOT.search_spaces.nodes.EstimatorNode (or base_node) objects if name is a list or group of methods and return_choice_pipeline is False.
  • TPOT.search_spaces.pipelines.ChoicePipeline object if name is a list or group of methods and return_choice_pipeline is True. Note: for some special cases with methods using wrapped estimators, the returned search space is a TPOT.search_spaces.pipelines.WrapperPipeline object.
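The three return cases can be traced with a stdlib-only stand-in. Everything named `sketch_*` or `Sketch*` below is hypothetical, including the `"scalers"` group and the way group names are expanded; TPOT's flatten_group_names may behave differently. Strings stand in for EstimatorNode objects and a plain list stands in for the NumPy array.

```python
from dataclasses import dataclass

# Hypothetical group table; TPOT defines its own groups.
SKETCH_GROUPS = {"scalers": ["MinMaxScaler", "StandardScaler"]}


@dataclass
class SketchChoice:
    """Stand-in for a pipeline that samples from several search spaces."""
    search_spaces: list


def sketch_flatten(name):
    """Expand group names; a plain method name is returned unchanged."""
    if isinstance(name, str):
        return SKETCH_GROUPS.get(name, name)
    flat = []
    for item in name:
        expanded = sketch_flatten(item)
        flat.extend(expanded if isinstance(expanded, list) else [expanded])
    return flat


def sketch_search_space(name, return_choice_pipeline=True):
    name = sketch_flatten(name)
    if isinstance(name, list):
        # Build each member without wrapping, then wrap once at the top level.
        nodes = [sketch_search_space(n, return_choice_pipeline=False) for n in name]
        return SketchChoice(nodes) if return_choice_pipeline else nodes
    return f"node:{name}"  # stand-in for a single EstimatorNode


single = sketch_search_space("PCA")
choice = sketch_search_space(["scalers", "PCA"])
members = sketch_search_space(["scalers", "PCA"], return_choice_pipeline=False)
```

A single name yields one node, while a list or group yields either one choice object or the bare collection of nodes, depending on return_choice_pipeline.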
Source code in tpot/config/get_configspace.py
def get_search_space(name, n_classes=3, n_samples=1000, n_features=100, random_state=None, return_choice_pipeline=True, base_node=EstimatorNode, n_jobs=1):
    """
    Returns a TPOT search space for a given scikit-learn method or group of methods.

    Parameters
    ----------
    name : str or list
        The name of the scikit-learn method or group of methods for which to create the search space.
        - str: The name of the scikit-learn method. (e.g. 'RandomForestClassifier' for sklearn.ensemble.RandomForestClassifier)
        Alternatively, the name of a group of methods. (e.g. 'classifiers' for all classifiers).
        - list: A list of scikit-learn method names. (e.g. ['RandomForestClassifier', 'ExtraTreesClassifier'])
    n_classes : int (default=3)
        The number of classes in the target variable.
    n_samples : int (default=1000)
        The number of samples in the dataset.
    n_features : int (default=100)
        The number of features in the dataset.
    random_state : int (default=None)
        A fixed random_state to pass through to all methods that have a random_state hyperparameter. 
    return_choice_pipeline : bool (default=True)
        If False, returns a list of TPOT.search_spaces.nodes.EstimatorNode objects.
        If True, returns a single TPOT.search_spaces.pipelines.ChoicePipeline that includes and samples from all EstimatorNodes.
    base_node: TPOT.search_spaces.base.SearchSpace (default=TPOT.search_spaces.nodes.EstimatorNode)
        The SearchSpace to pass the configuration space to. If you want to experiment with custom mutation/crossover operators, you can pass a custom SearchSpace node here.
    n_jobs : int (default=1)
        Sets the n_jobs parameter for estimators that have it. Default is 1.

    Returns
    -------
        Returns a SearchSpace object that can be optimized by TPOT.
        - TPOT.search_spaces.nodes.EstimatorNode (or base_node) if name is a single method name.
        - NumPy array of TPOT.search_spaces.nodes.EstimatorNode (or base_node) objects if name is a list or group of methods and return_choice_pipeline is False.
        - TPOT.search_spaces.pipelines.ChoicePipeline object if name is a list or group of methods and return_choice_pipeline is True.
        Note: for some special cases with methods using wrapped estimators, the returned search space is a TPOT.search_spaces.pipelines.WrapperPipeline object.

    """
    name = flatten_group_names(name)

    #if list of names, return a list of EstimatorNodes
    if isinstance(name, (list, np.ndarray)):
        search_spaces = [get_search_space(n, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, return_choice_pipeline=False, base_node=base_node, n_jobs=n_jobs) for n in name]
        #remove Nones
        search_spaces = [s for s in search_spaces if s is not None]

        if return_choice_pipeline:
            return ChoicePipeline(search_spaces=np.hstack(search_spaces))
        else:
            return np.hstack(search_spaces)

    # if name in GROUPNAMES:
    #     name_list = GROUPNAMES[name]
    #     return get_search_space(name_list, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, return_choice_pipeline=return_choice_pipeline, base_node=base_node)

    return get_node(name, n_classes=n_classes, n_samples=n_samples, n_features=n_features, random_state=random_state, base_node=base_node, n_jobs=n_jobs)