Passkbinsdiscretizer
This file is part of the TPOT library.
The current version of TPOT was developed at Cedars-Sinai by:
- Pedro Henrique Ribeiro (https://github.com/perib, https://www.linkedin.com/in/pedro-ribeiro/)
- Anil Saini (anil.saini@cshs.org)
- Jose Hernandez (jgh9094@gmail.com)
- Jay Moran (jay.moran@cshs.org)
- Nicholas Matsumoto (nicholas.matsumoto@cshs.org)
- Hyunjun Choi (hyunjun.choi@cshs.org)
- Miguel E. Hernandez (miguel.e.hernandez@cshs.org)
- Jason Moore (moorejh28@gmail.com)
The original version of TPOT was primarily developed at the University of Pennsylvania by:
- Randal S. Olson (rso@randalolson.com)
- Weixuan Fu (weixuanf@upenn.edu)
- Daniel Angell (dpa34@drexel.edu)
- Jason Moore (moorejh28@gmail.com)
- and many more generous open-source contributors
TPOT is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
TPOT is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with TPOT. If not, see http://www.gnu.org/licenses/.
PassKBinsDiscretizer
Bases: BaseEstimator, TransformerMixin
Source code in tpot2/builtin_modules/passkbinsdiscretizer.py
random_state = random_state
instance-attribute
Same as sklearn.preprocessing.KBinsDiscretizer, but passes through columns that are not discretized due to having fewer than n_bins unique values instead of ignoring them. See sklearn.preprocessing.KBinsDiscretizer for more information.
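The pass-through behavior described above can be approximated with plain scikit-learn. The following is a minimal sketch of the idea, not TPOT's implementation: it assumes that columns with at least `n_bins` unique values are discretized and all others are forwarded unchanged via `ColumnTransformer(remainder="passthrough")`. The exact boundary condition and the variable names (`to_bin`, `ct`) are assumptions made for illustration.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(0)
# Column 0 is continuous; column 1 is binary (fewer unique values than n_bins).
X = np.column_stack([rng.normal(size=100), rng.integers(0, 2, size=100)])

n_bins = 5
# Assumption: only columns with at least n_bins unique values are discretized.
to_bin = [i for i in range(X.shape[1]) if len(np.unique(X[:, i])) >= n_bins]

ct = ColumnTransformer(
    [("bins", KBinsDiscretizer(n_bins=n_bins, encode="ordinal", strategy="quantile"), to_bin)],
    remainder="passthrough",  # forward the remaining columns unchanged
)
Xt = ct.fit_transform(X)

# Discretized columns come first in the output, followed by the passed-through ones.
print(Xt.shape)
```

In this sketch the binary column reaches the output untouched, which is the property that distinguishes the pass-through variant from discretizing every column.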
select_features(X, min_unique=10)
Given a DataFrame or numpy array, return a list of column indices that have more than min_unique unique values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | | Data to select features from | required |
min_unique | | Minimum number of unique values a column must have to be selected | 10 |
Returns:
Type | Description |
---|---|
list | List of column indices that have more than min_unique unique values |
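The selection rule documented above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's source: the name `select_features_sketch` is hypothetical, and returning positional indices for a DataFrame is an assumption based on the "column indices" wording.

```python
import numpy as np
import pandas as pd


def select_features_sketch(X, min_unique=10):
    """Return positional indices of columns with more than min_unique unique values."""
    if isinstance(X, pd.DataFrame):
        # Assumption: positional indices are returned, not column labels.
        return [i for i in range(X.shape[1]) if len(X.iloc[:, i].unique()) > min_unique]
    X = np.asarray(X)
    return [i for i in range(X.shape[1]) if len(np.unique(X[:, i])) > min_unique]


# Column 0 has 20 unique values, column 1 has only 3.
X = np.column_stack([np.arange(20), np.arange(20) % 3])
selected_array = select_features_sketch(X, min_unique=10)
selected_frame = select_features_sketch(pd.DataFrame(X, columns=["a", "b"]), min_unique=10)
print(selected_array, selected_frame)
```

Note that the comparison is strict: a column with exactly `min_unique` unique values is not selected.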