Installation
TPOT is built on top of several existing Python libraries, including NumPy, SciPy, scikit-learn, DEAP, update_checker, tqdm, stopit, pandas, joblib, and xgboost.
Most of the necessary Python packages can be installed via the Anaconda Python distribution, which we strongly recommend that you use. Support for Python 3.4 and below was officially dropped in version 0.11.0.
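As a quick sanity check before installing, the interpreter version can be compared against the floor stated above. The following is a minimal sketch: the name MINIMUM_PYTHON and the helper python_is_supported are hypothetical, and the value (3, 5) is inferred only from the sentence above. Newer TPOT releases may require a more recent Python.

```python
import sys

# Assumption: inferred from the note above that support for Python 3.4
# and below was dropped in TPOT 0.11.0. Later releases may need more.
MINIMUM_PYTHON = (3, 5)


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the stated floor."""
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print("Python " + current + " meets the stated minimum.")
    else:
        print("Python " + current + " is too old for TPOT 0.11.0 and later.")
```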
You can install TPOT using pip or conda-forge.
pip
NumPy, SciPy, scikit-learn, pandas, joblib, and PyTorch can be installed in Anaconda via the command:
    conda install numpy scipy scikit-learn pandas joblib pytorch
DEAP, update_checker, tqdm, stopit, and xgboost can be installed with pip via the command:
    pip install deap update_checker tqdm stopit xgboost
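Once the packages listed above are installed, it can help to confirm that Python can actually locate them. The sketch below is one possible check, not part of TPOT: the helper missing_modules and the list CORE_MODULES are hypothetical, and the import names are assumptions mapped from the package names (for example, scikit-learn is imported as sklearn).

```python
import importlib.util

# Assumption: top-level import names for the packages named above.
CORE_MODULES = [
    "numpy", "scipy", "sklearn", "pandas", "joblib",
    "deap", "update_checker", "tqdm", "stopit", "xgboost",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(CORE_MODULES)
    if missing:
        print("Not found: " + ", ".join(missing))
    else:
        print("All core dependencies were found.")
```

The check uses find_spec rather than importing each module, so it does not execute any package code and stays fast even for large libraries.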
Windows users: pip installation may not work on some Windows environments, and it may cause unexpected errors. If you have issues installing XGBoost, check the XGBoost installation documentation.
If you plan to use Dask for parallel training, make sure to install dask[delayed], dask[dataframe], and dask-ml. Note that dask-ml>=1.7 requires distributed>=2.4.0 and scikit-learn>=0.23.0.
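The version constraint in the note above can be checked against an existing environment. The following is a rough sketch under stated assumptions: the helpers parse_version, dask_ml_conflicts, and installed_version are hypothetical, the minimums are copied from the note, and the version parsing is deliberately simplified (pre-release suffixes such as rc1 are ignored).

```python
import re
from importlib import metadata

# Minimums quoted in the note above; they apply when dask-ml >= 1.7.
REQUIRED_BY_DASK_ML = {"distributed": "2.4.0", "scikit-learn": "0.23.0"}


def parse_version(text, width=3):
    """Turn '2.4.0' into (2, 4, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:width]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    parts.extend([0] * (width - len(parts)))  # pad so '1.7' equals '1.7.0'
    return tuple(parts)


def dask_ml_conflicts(installed):
    """List unmet minimums, given a {distribution: version or None} map."""
    dask_ml = installed.get("dask-ml")
    if dask_ml is None or parse_version(dask_ml) < (1, 7, 0):
        return []  # the constraint in the note does not apply
    conflicts = []
    for name, minimum in REQUIRED_BY_DASK_ML.items():
        found = installed.get(name)
        if found is None or parse_version(found) < parse_version(minimum):
            conflicts.append(
                name + ">=" + minimum + " required, found " + str(found)
            )
    return conflicts


def installed_version(name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    names = ["dask-ml"] + list(REQUIRED_BY_DASK_ML)
    found = {name: installed_version(name) for name in names}
    for line in dask_ml_conflicts(found) or ["No conflicts detected."]:
        print(line)
```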
If you plan to use the TPOT-MDR configuration, make sure to install scikit-mdr and scikit-rebate:
    pip install scikit-mdr skrebate
To enable support for PyTorch-based neural networks (TPOT-NN), you will need to install PyTorch. TPOT-NN will work with either CPU or GPU PyTorch, but we strongly recommend using a GPU version, if possible, as CPU PyTorch models tend to train very slowly.
We recommend following PyTorch's installation instructions customized for your operating system and Python distribution.
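After installing PyTorch, it may be worth checking whether the installed build can see a GPU, since the paragraph above notes that CPU-only training tends to be slow. This is a minimal sketch; describe_pytorch is a hypothetical helper, while torch.cuda.is_available and torch.cuda.get_device_name are existing PyTorch calls.

```python
def describe_pytorch():
    """Summarize whether PyTorch is importable and whether it sees a GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed."
    if torch.cuda.is_available():
        # Report the first visible CUDA device only.
        return "PyTorch {} with CUDA device: {}".format(
            torch.__version__, torch.cuda.get_device_name(0)
        )
    return "PyTorch {} (CPU only; TPOT-NN training may be slow).".format(
        torch.__version__
    )


if __name__ == "__main__":
    print(describe_pytorch())
```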
Finally, to install TPOT itself, run the following command:
    pip install tpot
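To confirm that the installation succeeded, TPOT can be imported and its version printed. The sketch below assumes the package exposes a __version__ attribute and falls back to "unknown" if it does not; tpot_version is a hypothetical helper.

```python
def tpot_version():
    """Return TPOT's version string, or None if TPOT cannot be imported."""
    try:
        import tpot
    except ImportError:
        return None
    # Assumption: the package exposes __version__; fall back if it does not.
    return getattr(tpot, "__version__", "unknown")


if __name__ == "__main__":
    version = tpot_version()
    if version is None:
        print("TPOT is not installed in this environment.")
    else:
        print("TPOT version: " + version)
```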
conda-forge
To install tpot and its core dependencies, you can use:
    conda install -c conda-forge tpot
To install additional dependencies, you can use:
    conda install -c conda-forge tpot xgboost dask dask-ml scikit-mdr skrebate
As mentioned above, we recommend following PyTorch's installation instructions for installing it to enable support for PyTorch-based neural networks (TPOT-NN).
Installation for using the TPOT-cuML configuration
With the "TPOT cuML" configuration (see built-in configurations), TPOT will search over a restricted configuration using the GPU-accelerated estimators in RAPIDS cuML and DMLC XGBoost. This configuration requires an NVIDIA GPU of Pascal architecture or newer (compute capability 6.0+) and an installed cuML library. With this configuration, all model training and prediction is GPU-accelerated. It is particularly useful for medium-sized and larger datasets, on which CPU-based estimators are a common bottleneck, and it works for both the TPOTClassifier and TPOTRegressor.
Please download this conda environment yml file to install TPOT for use with the TPOT-cuML configuration.
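The two requirements named above (compute capability 6.0+ and an installed cuML library) can be checked from Python. This is only a sketch under stated assumptions: the helper names are hypothetical, and PyTorch is used here purely as a convenient way to query the GPU. The TPOT-cuML configuration itself does not need PyTorch, so a missing PyTorch install simply means the capability cannot be determined this way.

```python
import importlib.util

MINIMUM_CAPABILITY = (6, 0)  # Pascal or newer, as stated above


def capability_is_sufficient(capability):
    """Return True if a (major, minor) compute capability is 6.0 or higher."""
    return tuple(capability) >= MINIMUM_CAPABILITY


def cuml_is_installed():
    """Return True if the cuml package can be located."""
    return importlib.util.find_spec("cuml") is not None


def detect_capability():
    """Return (major, minor) for GPU 0, or None if it cannot be queried.

    Assumption: PyTorch is only a convenient way to ask the driver here.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    capability = detect_capability()
    if capability is None:
        print("Could not determine the GPU compute capability.")
    else:
        print("Compute capability:", capability,
              "sufficient:", capability_is_sufficient(capability))
    print("cuML installed:", cuml_is_installed())
```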
Installation problems
Please file a new issue if you run into installation problems.