TPOT is built on top of several existing Python libraries, including NumPy, SciPy, scikit-learn, DEAP, update_checker, tqdm, stopit, pandas, joblib, and xgboost.
Most of the necessary Python packages can be installed via the Anaconda Python distribution, which we strongly recommend that you use. Support for Python 3.4 and below has been officially dropped since version 0.11.0.
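Before installing, it can help to confirm that the active interpreter is new enough. The following is a minimal sketch, not part of TPOT: the helper name `python_is_supported` is hypothetical, and the minimum of 3.5 is an assumption inferred from the statement that Python 3.4 and below are no longer supported. Recent TPOT releases may require a newer Python than this.

```python
import sys

# Assumption: 3.5 is the oldest supported interpreter, inferred from the note
# that Python 3.4 and below were dropped in TPOT 0.11.0. Newer releases may
# require a more recent version.
MIN_PYTHON = (3, 5)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the result for the interpreter running this script.
status = "supported" if python_is_supported() else "too old"
print("Python {}.{} is {}".format(sys.version_info[0], sys.version_info[1], status))
```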
You can install TPOT using pip or conda-forge.
NumPy, SciPy, scikit-learn, pandas, joblib, and PyTorch can be installed in Anaconda via the command:
conda install numpy scipy scikit-learn pandas joblib pytorch
DEAP, update_checker, tqdm, stopit and xgboost can be installed with pip via the command:
pip install deap update_checker tqdm stopit xgboost
Windows users: pip installation may not work on some Windows environments, and it may cause unexpected errors. If you have issues installing XGBoost, check the XGBoost installation documentation.
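After running the conda and pip commands above, one way to see whether anything is still missing is to ask Python which modules it can locate. This is a rough sketch rather than an official TPOT tool: the helper `find_missing` is hypothetical, and the import names in the mapping (for example `sklearn` for scikit-learn and `torch` for PyTorch) are the commonly used ones, stated here as assumptions.

```python
from importlib.util import find_spec

# Package name as used on the command line -> module name used in `import`.
# The import names are assumptions based on common usage of these packages.
CORE_DEPENDENCIES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "joblib": "joblib",
    "pytorch": "torch",
    "deap": "deap",
    "update_checker": "update_checker",
    "tqdm": "tqdm",
    "stopit": "stopit",
    "xgboost": "xgboost",
}


def find_missing(dependencies=CORE_DEPENDENCIES):
    """Return a sorted list of package names whose module cannot be located."""
    return sorted(
        package
        for package, module in dependencies.items()
        if find_spec(module) is None  # None means the module was not found
    )


missing = find_missing()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All core dependencies were found.")
```

The check only locates the modules and does not import them, so it will not catch a package that is present but broken.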
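If you plan to use Dask for parallel training, install the Dask packages as well. The arguments are quoted so that the shell does not interpret the square brackets or treat >= as an output redirection:
pip install "dask[delayed]" "dask[dataframe]" dask-ml "fsspec>=0.3.3" "distributed>=2.10.0"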
If you plan to use the TPOT-MDR configuration, install scikit-mdr and scikit-rebate:
pip install scikit-mdr skrebate
To enable support for PyTorch-based neural networks (TPOT-NN), you will need to install PyTorch. TPOT-NN will work with either CPU or GPU PyTorch, but we strongly recommend using a GPU version, if possible, as CPU PyTorch models tend to train very slowly.
We recommend following PyTorch's installation instructions customized for your operating system and Python distribution.
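Once PyTorch is installed, a quick way to see whether you ended up with a usable GPU build is to ask PyTorch itself. The sketch below is illustrative only: the function name `torch_backend` is hypothetical, and it relies on `torch.cuda.is_available()`, which reports whether PyTorch can currently use a CUDA device.

```python
def torch_backend():
    """Return "not installed", "cpu", or "gpu" for the local PyTorch setup."""
    try:
        import torch
    except ImportError:
        # PyTorch is optional, so a missing install is not treated as an error.
        return "not installed"
    return "gpu" if torch.cuda.is_available() else "cpu"


print("PyTorch backend:", torch_backend())
```

A result of "cpu" means TPOT-NN should still work, but training is likely to be slow, as noted above.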
Finally, to install TPOT itself, run the following command:
pip install tpot
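To confirm that the installation succeeded, you can look up the installed version from Python. This is a minimal sketch assuming Python 3.8 or later, where `importlib.metadata` is part of the standard library; the helper name `installed_version` is hypothetical, and the distribution name `tpot` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="tpot"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("TPOT does not appear to be installed in this environment.")
else:
    print("TPOT version:", found)
```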
To install TPOT and its core dependencies from conda-forge, you can use:
conda install -c conda-forge tpot
To install additional dependencies you can use:
conda install -c conda-forge tpot xgboost dask dask-ml scikit-mdr skrebate
Installation for using TPOT-cuML configuration
With the "TPOT cuML" configuration (see built-in configurations), TPOT will search over a restricted configuration using the GPU-accelerated estimators in RAPIDS cuML and DMLC XGBoost. This configuration requires an NVIDIA GPU of the Pascal architecture or newer (compute capability 6.0+) and an installed cuML library. With this configuration, all model training and prediction will be GPU-accelerated. It is particularly useful for medium-sized and larger datasets on which CPU-based estimators are a common bottleneck, and works for both the TPOTClassifier and TPOTRegressor.
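The compute capability requirement can be checked before creating the environment. The sketch below is an assumption-laden illustration, not part of TPOT or cuML: the helper names are hypothetical, and it uses PyTorch's `torch.cuda.get_device_capability()` to query the GPU only because PyTorch is already mentioned in this guide. cuML itself does not require PyTorch, and `nvidia-smi` or NVIDIA's documentation can give the same information.

```python
# Minimum compute capability stated in the text above (Pascal or newer).
MIN_COMPUTE_CAPABILITY = (6, 0)


def meets_cuml_requirement(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Return True if a (major, minor) capability is at least `minimum`."""
    return tuple(capability) >= tuple(minimum)


def detect_compute_capability():
    """Return (major, minor) for the default CUDA device, or None if unknown."""
    try:
        import torch  # assumption: PyTorch is used here only as a query tool
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


capability = detect_compute_capability()
if capability is None:
    print("Could not detect a CUDA GPU from Python.")
elif meets_cuml_requirement(capability):
    print("GPU compute capability {}.{} meets the 6.0+ requirement.".format(*capability))
else:
    print("GPU compute capability {}.{} is below 6.0.".format(*capability))
```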
Please download this conda environment yml file (tpot-cuml.yml) to install TPOT for use with the TPOT-cuML configuration, then create and activate the environment:
conda env create -f tpot-cuml.yml -n tpot-cuml
conda activate tpot-cuml
Please file a new issue if you run into installation problems.