Installation

TPOT is built on top of several existing Python libraries, including NumPy, SciPy, scikit-learn, DEAP, update_checker, tqdm, stopit, pandas, joblib, and xgboost.

Most of the necessary Python packages can be installed via the Anaconda Python distribution, which we strongly recommend. Support for Python 3.4 and below was officially dropped in version 0.11.0.
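Before installing anything, it can help to confirm that the interpreter is recent enough. The following is a minimal sketch, not part of TPOT: the helper name python_is_supported is hypothetical, and the 3.5 minimum is only inferred from the statement that 3.4 and below are unsupported. Newer TPOT releases may require a later Python version.

```python
import sys

# Assumption: the oldest usable Python is 3.5, inferred from the note that
# Python 3.4 and below are no longer supported. Newer releases may need more.
MIN_PYTHON = (3, 5)


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


# Report the result for the interpreter running this script.
print("Python", sys.version.split()[0], "supported:", python_is_supported())
```

Run it with the same interpreter you intend to install TPOT into, since Anaconda environments can each carry a different Python version.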

You can install TPOT using pip or conda-forge.

pip

NumPy, SciPy, scikit-learn, pandas, joblib, and PyTorch can be installed in Anaconda via the command:

conda install numpy scipy scikit-learn pandas joblib pytorch

DEAP, update_checker, tqdm, stopit and xgboost can be installed with pip via the command:

pip install deap update_checker tqdm stopit xgboost

Windows users: pip installation may not work on some Windows environments, and it may cause unexpected errors. If you have issues installing XGBoost, check the XGBoost installation documentation.

If you plan to use Dask for parallel training, make sure to install dask[delayed], dask[dataframe], and dask-ml. Note that dask-ml>=1.7 requires distributed>=2.4.0 and scikit-learn>=0.23.0.

pip install "dask[delayed]" "dask[dataframe]" dask-ml "fsspec>=0.3.3" "distributed>=2.10.0"

If you plan to use the TPOT-MDR configuration, make sure to install scikit-mdr and scikit-rebate:

pip install scikit-mdr skrebate

To enable support for PyTorch-based neural networks (TPOT-NN), you will need to install PyTorch. TPOT-NN will work with either CPU or GPU PyTorch, but we strongly recommend using a GPU version, if possible, as CPU PyTorch models tend to train very slowly.

We recommend following PyTorch's installation instructions customized for your operating system and Python distribution.

Finally, to install TPOT itself, run the following command:

pip install tpot
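After the pip steps above, a quick way to see which packages actually ended up in the environment is to ask Python whether each one can be located. The sketch below is an illustration only: the helper missing_packages and the two groupings are hypothetical, and the import names are assumptions where they differ from the pip names (scikit-learn is imported as sklearn, PyTorch as torch, scikit-mdr as mdr).

```python
import importlib.util

# Import names for the packages named in this section. The grouping into
# "base" and "extras" is an assumption made for this example only.
BASE = ["numpy", "scipy", "sklearn", "pandas", "joblib",
        "deap", "update_checker", "tqdm", "stopit", "xgboost"]
EXTRAS = ["torch", "dask", "dask_ml", "distributed", "mdr", "skrebate"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# find_spec only locates a package; it does not import it, so this stays fast.
print("Missing base packages:", missing_packages(BASE) or "none")
print("Missing extras:", missing_packages(EXTRAS) or "none")
print("TPOT found:", importlib.util.find_spec("tpot") is not None)
```

A package that is found can still fail at import time (for example, because of a broken binary build), so treat an empty "missing" list as a first check rather than proof of a working installation.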

conda-forge

To install tpot and its core dependencies you can use:

conda install -c conda-forge tpot

To install additional dependencies you can use:

conda install -c conda-forge tpot xgboost dask dask-ml scikit-mdr skrebate

As mentioned above, we recommend following PyTorch's installation instructions to install PyTorch and enable support for PyTorch-based neural networks (TPOT-NN).

Installation for using the TPOT-cuML configuration

With the "TPOT cuML" configuration (see built-in configurations), TPOT searches over a restricted configuration using the GPU-accelerated estimators in RAPIDS cuML and DMLC XGBoost. This configuration requires an NVIDIA GPU of the Pascal architecture or newer (compute capability 6.0+) and an installed cuML library. With this configuration, all model training and prediction is GPU-accelerated. It is particularly useful for medium-sized and larger datasets on which CPU-based estimators are a common bottleneck, and it works with both TPOTClassifier and TPOTRegressor.

To install TPOT for use with the TPOT-cuML configuration, download this conda environment yml file and create the environment from it:

conda env create -f tpot-cuml.yml -n tpot-cuml
conda activate tpot-cuml
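Once the environment is active, the two requirements stated above (cuML installed, compute capability 6.0 or higher) can be checked with a short script. This is a sketch under assumptions: capability_ok and cuml_available are hypothetical helper names, and the script does not query the GPU itself. You need to look up your card's compute capability (for example, in NVIDIA's published tables) and pass it in as a string.

```python
import importlib.util

# Minimum compute capability stated in this section for TPOT cuML.
MIN_COMPUTE_CAPABILITY = (6, 0)


def capability_ok(capability):
    """Check a compute capability string such as "7.5" against the minimum."""
    major, minor = (int(part) for part in capability.split("."))
    # Tuples compare element by element, so (10, 0) correctly beats (6, 0).
    return (major, minor) >= MIN_COMPUTE_CAPABILITY


def cuml_available():
    """Return True if the cuml package can be located in this environment."""
    return importlib.util.find_spec("cuml") is not None


print("cuML found:", cuml_available())
print("Capability 6.1 sufficient:", capability_ok("6.1"))
```

Comparing the version as an integer tuple rather than as a float or string avoids misordering values such as "10.0" and "6.0".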

Installation problems

Please file a new issue if you run into installation problems.