Genetic encoders
Code from https://github.com/EpistasisLab/autoqtl. This file contains the class definitions for all the genetic encoders. All genetic encoder classes inherit from the scikit-learn BaseEstimator and TransformerMixin classes to follow the scikit-learn paradigm.
DominantEncoder
Bases: BaseEstimator, TransformerMixin
This class contains the function definition for encoding the input features as a Dominant genetic model. The encoding used is AA(0)->1, Aa(1)->1, aa(2)->0.
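As a rough illustration of the mapping stated above, the Dominant encoding can be reproduced with an element-wise dictionary lookup in NumPy. This is a sketch rather than the library's own implementation: the helper name `dominant_encode` is hypothetical, and the code assumes every entry of `X` is one of the genotype codes 0, 1, or 2.

```python
import numpy as np

# Mapping stated in the class description: AA(0)->1, Aa(1)->1, aa(2)->0.
DOMINANT_MAP = {0: 1, 1: 1, 2: 0}


def dominant_encode(X):
    """Hypothetical helper: apply the Dominant mapping element-wise."""
    X = np.asarray(X)
    # np.vectorize calls the lookup once per element of X.
    lookup = np.vectorize(lambda code: DOMINANT_MAP[code], otypes=[int])
    return lookup(X)


# Rows are individuals, columns are features (assumed layout).
genotypes = np.array([[0, 1, 2],
                      [2, 2, 0]])
encoded = dominant_encode(genotypes)
print(encoded)
```

The output has the same shape as the input, with each genotype code replaced by its encoded value.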
Source code in tpot2/builtin_modules/genetic_encoders.py
fit(X, y=None)
Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like | | required |
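Because `fit` learns nothing, its only job is to return the estimator so that calls can be chained, which is the convention scikit-learn pipelines rely on. The minimal sketch below shows that pattern with a hypothetical stand-in class, `NoOpFitEncoder`, which is not part of tpot2 and does not inherit from the scikit-learn base classes.

```python
class NoOpFitEncoder:
    """Hypothetical stand-in showing the stateless fit/transform pattern."""

    def fit(self, X, y=None):
        # Nothing is learned from X or y; return self so calls can be chained.
        return self

    def transform(self, X, y=None):
        # Placeholder transform: return a copy of the rows, unchanged.
        return [list(row) for row in X]


encoder = NoOpFitEncoder()
X = [[0, 1], [2, 0]]

fitted = encoder.fit(X)                # the same object comes back
result = encoder.fit(X).transform(X)   # chaining works because fit returns self
```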
transform(X, y=None)
Transform the data by applying the Dominant encoding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
y | None | Unused | None |
Returns:
Name | Type | Description |
---|---|---|
X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |
HeterosisEncoder
Bases: BaseEstimator, TransformerMixin
This class contains the function definition for encoding the input features as a Heterozygote Advantage genetic model. The encoding used is AA(0)->0, Aa(1)->1, aa(2)->0.
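Under this mapping only heterozygotes (code 1) are encoded as 1, so for inputs restricted to the codes 0, 1, and 2 the result matches an element-wise test for `X == 1`. The sketch below illustrates that equivalence with NumPy. `heterosis_encode` is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def heterosis_encode(X):
    """Hypothetical helper: 1 where the genotype code is 1 (Aa), else 0."""
    X = np.asarray(X)
    # AA(0)->0, Aa(1)->1, aa(2)->0 reduces to an equality test against 1.
    return (X == 1).astype(int)


genotypes = np.array([[0, 1, 2],
                      [1, 1, 0]])
encoded = heterosis_encode(genotypes)
print(encoded)
```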
fit(X, y=None)
Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like | | required |
transform(X, y=None)
Transform the data by applying the Heterosis encoding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
y | None | Unused | None |
Returns:
Name | Type | Description |
---|---|---|
X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |
OverDominanceEncoder
Bases: BaseEstimator, TransformerMixin
This class contains the function definition for encoding the input features as an Over Dominance genetic model. The encoding used is AA(0)->1, Aa(1)->2, aa(2)->0.
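One way to picture this mapping is as a three-entry lookup table indexed by the genotype code. The sketch below applies such a table with NumPy integer indexing. The names `OVER_DOMINANCE_LUT` and `over_dominance_encode` are hypothetical, the code is not the library's implementation, and it assumes the input contains only the codes 0, 1, and 2.

```python
import numpy as np

# Position i holds the encoded value for genotype code i:
# AA(0)->1, Aa(1)->2, aa(2)->0.
OVER_DOMINANCE_LUT = np.array([1, 2, 0])


def over_dominance_encode(X):
    """Hypothetical helper: encode by indexing into the lookup table."""
    X = np.asarray(X, dtype=int)
    # Integer-array indexing replaces every code with its table entry.
    return OVER_DOMINANCE_LUT[X]


genotypes = np.array([[0, 1, 2],
                      [2, 0, 1]])
encoded = over_dominance_encode(genotypes)
print(encoded)
```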
fit(X, y=None)
Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like | | required |
transform(X, y=None)
Transform the data by applying the Over Dominance encoding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
y | None | Unused | None |
Returns:
Name | Type | Description |
---|---|---|
X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |
RecessiveEncoder
Bases: BaseEstimator, TransformerMixin
This class contains the function definition for encoding the input features as a Recessive genetic model. The encoding used is AA(0)->0, Aa(1)->1, aa(2)->1.
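The Recessive mapping leaves code 0 unchanged and collapses codes 1 and 2 to 1, which for inputs restricted to 0, 1, and 2 is the same as capping every value at 1. The following sketch shows that shortcut with NumPy. `recessive_encode` is a hypothetical helper and not the library's implementation.

```python
import numpy as np


def recessive_encode(X):
    """Hypothetical helper: cap each genotype code at 1."""
    X = np.asarray(X)
    # AA(0)->0, Aa(1)->1, aa(2)->1 is an element-wise minimum with 1.
    return np.minimum(X, 1)


genotypes = np.array([[0, 1, 2],
                      [2, 2, 0]])
encoded = recessive_encode(genotypes)
print(encoded)
```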
fit(X, y=None)
Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like | | required |
transform(X, y=None)
Transform the data by applying the Recessive encoding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
y | None | Unused | None |
Returns:
Name | Type | Description |
---|---|---|
X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |
UnderDominanceEncoder
Bases: BaseEstimator, TransformerMixin
This class contains the function definition for encoding the input features as an Under Dominance genetic model. The encoding used is AA(0)->2, Aa(1)->0, aa(2)->1.
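On the codes 0, 1, and 2 this mapping is a cyclic shift, (x + 2) mod 3, and it undoes the Over Dominance mapping AA(0)->1, Aa(1)->2, aa(2)->0, which is the shift (x + 1) mod 3. The sketch below checks both observations with NumPy. Both helper functions are hypothetical, neither is the library's implementation, and the modular form only agrees with the documented mappings when the input contains nothing but 0, 1, and 2.

```python
import numpy as np


def under_dominance_encode(X):
    """Hypothetical helper: AA(0)->2, Aa(1)->0, aa(2)->1 as (x + 2) mod 3."""
    X = np.asarray(X, dtype=int)
    return (X + 2) % 3


def over_dominance_encode(X):
    """Hypothetical helper: AA(0)->1, Aa(1)->2, aa(2)->0 as (x + 1) mod 3."""
    X = np.asarray(X, dtype=int)
    return (X + 1) % 3


genotypes = np.array([[0, 1, 2],
                      [2, 0, 1]])
encoded = under_dominance_encode(genotypes)

# Applying Over Dominance and then Under Dominance restores the original codes.
round_trip = under_dominance_encode(over_dominance_encode(genotypes))
print(encoded)
print(np.array_equal(round_trip, genotypes))
```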
fit(X, y=None)
Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like | | required |
transform(X, y=None)
Transform the data by applying the Under Dominance encoding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
y | None | Unused | None |
Returns:
Name | Type | Description |
---|---|---|
X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |