Genetic encoders

Code from https://github.com/EpistasisLab/autoqtl. This file contains the class definitions for all the genetic encoders. Each encoder class inherits from the scikit-learn BaseEstimator and TransformerMixin classes, so it follows the scikit-learn estimator API.

`DominantEncoder`

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as a Dominant genetic model. The encoding used is AA(0)->1, Aa(1)->1, aa(2)->0.
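The mapping above can be tried on a small genotype matrix. The following is a minimal sketch that reproduces the lookup with plain NumPy rather than importing the class; the names `dominant_map`, `encode`, and `X_dominant` are illustrative and do not appear in the module.

```python
import numpy as np

# Genotype matrix: rows are individuals, columns are variants.
# Codes follow the convention above: 0 = AA, 1 = Aa, 2 = aa.
X = np.array([[0, 1, 2],
              [2, 2, 0],
              [1, 0, 1]])

# Dominant model: AA(0)->1, Aa(1)->1, aa(2)->0.
dominant_map = {0: 1, 1: 1, 2: 0}

# Apply the lookup elementwise; codes missing from the map pass through unchanged.
encode = np.vectorize(lambda g: dominant_map.get(g, g))
X_dominant = encode(X)

print(X_dominant.tolist())  # -> [[1, 1, 0], [0, 0, 1], [1, 1, 1]]
```

The output has the same shape as the input, with each genotype code replaced by its encoded value.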

Source code in tpot2/builtin_modules/genetic_encoders.py

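# Imports needed by the encoder classes in this module.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils import check_array


class DominantEncoder(BaseEstimator, TransformerMixin):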
    """This class contains the function definition for encoding the input features as a Dominant genetic model.
    The encoding used is AA(0)->1, Aa(1)->1, aa(2)->0. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Dominant encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 1, 1: 1, 2: 0}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

`fit(X, y=None)`

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.
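Because `fit` learns nothing and returns the estimator itself, calls can be chained. The sketch below illustrates that pattern with a hypothetical stand-in class, `StatelessEncoder`, which is not part of tpot2 and avoids the scikit-learn dependency; its `transform` assumes the input contains only the codes 0, 1, and 2.

```python
import numpy as np


class StatelessEncoder:
    """Hypothetical stand-in (not in tpot2) with the same fit/transform shape."""

    def fit(self, X, y=None):
        # Nothing is learned from X; returning self allows method chaining.
        return self

    def transform(self, X, y=None):
        # Dominant mapping as a placeholder; valid only for codes 0, 1, 2.
        return np.where(np.asarray(X) == 2, 0, 1)


encoder = StatelessEncoder()
fitted = encoder.fit([[0, 1, 2]])
chained = encoder.fit([[0, 1, 2]]).transform([[0, 1, 2]])

print(fitted is encoder)   # -> True
print(chained.tolist())    # -> [[1, 1, 0]]
```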

Parameters:

Name	Type	Description	Default
`X`	`array-like`		required

Source code in tpot2/builtin_modules/genetic_encoders.py

def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

`transform(X, y=None)`

Transform the data by applying the Dominant encoding.

Parameters:

Name	Type	Description	Default
`X`	`numpy ndarray, {n_samples, n_components}`	New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features).	required
`y`	`None`	Unused	`None`

Returns:

Name	Type	Description
`X_transformed`	`numpy ndarray, {n_samples, n_components}`	The encoded feature set
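The source listing for this method maps each element with `map[i] if i in map else i`, so a value outside 0, 1, and 2 is returned unchanged. The sketch below reproduces only that lookup step (it skips `check_array`); the value `-9` is a hypothetical missing-genotype sentinel chosen purely for illustration.

```python
import numpy as np

dominant_map = {0: 1, 1: 1, 2: 0}
encode = np.vectorize(lambda g: dominant_map[g] if g in dominant_map else g)

# -9 and 3 are not keys of the map, so they should pass through untouched.
X = np.array([[0, 1, 2, -9],
              [2, 3, 1, 0]])
X_transformed = encode(X)

print(X_transformed.tolist())  # -> [[1, 1, 0, -9], [0, 3, 1, 1]]
```

Unexpected codes are therefore not flagged by the mapping itself; validating the input range beforehand is left to the caller.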

Source code in tpot2/builtin_modules/genetic_encoders.py

def transform(self, X, y=None):
    """Transform the data by applying the Dominant encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 1, 1: 1, 2: 0}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

`HeterosisEncoder`

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as a Heterozygote Advantage genetic model. The encoding used is AA(0)->0, Aa(1)->1, aa(2)->0.
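For inputs restricted to 0, 1, and 2, this encoding amounts to an "is heterozygous" flag. The sketch below checks that equivalence with plain NumPy; `heterosis_map`, `X_heterosis`, and `is_heterozygous` are illustrative names, not part of the module.

```python
import numpy as np

X = np.array([[0, 1, 2],
              [1, 1, 0],
              [2, 0, 1]])

# Dictionary form, as in the class: AA(0)->0, Aa(1)->1, aa(2)->0.
heterosis_map = {0: 0, 1: 1, 2: 0}
X_heterosis = np.vectorize(lambda g: heterosis_map.get(g, g))(X)

# Equivalent flag for 0/1/2 inputs: 1 where the genotype is Aa, else 0.
is_heterozygous = (X == 1).astype(int)

print(X_heterosis.tolist())                         # -> [[0, 1, 0], [1, 1, 0], [0, 0, 1]]
print(np.array_equal(X_heterosis, is_heterozygous))  # -> True
```

The two forms differ for values outside 0, 1, and 2: the dictionary lookup passes them through, while the comparison maps them to 0.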

Source code in tpot2/builtin_modules/genetic_encoders.py

class HeterosisEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as a Heterozygote Advantage genetic model.
    The encoding used is AA(0)->0, Aa(1)->1, aa(2)->0. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Heterosis encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 0, 1: 1, 2: 0}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

`fit(X, y=None)`

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

Name	Type	Description	Default
`X`	`array-like`		required

Source code in tpot2/builtin_modules/genetic_encoders.py

def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

`transform(X, y=None)`

Transform the data by applying the Heterosis encoding.

Parameters:

Name	Type	Description	Default
`X`	`numpy ndarray, {n_samples, n_components}`	New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features).	required
`y`	`None`	Unused	`None`

Returns:

Name	Type	Description
`X_transformed`	`numpy ndarray, {n_samples, n_components}`	The encoded feature set

Source code in tpot2/builtin_modules/genetic_encoders.py

def transform(self, X, y=None):
    """Transform the data by applying the Heterosis encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 0, 1: 1, 2: 0}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

`OverDominanceEncoder`

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as an Over Dominance genetic model. The encoding used is AA(0)->1, Aa(1)->2, aa(2)->0.
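Because the genotype codes are the integers 0, 1, and 2, the same mapping can be written as an index into a three-element lookup array. This is an alternative sketch, not how the class is implemented; it assumes every entry of the input is 0, 1, or 2, and the names `lookup` and `X_over` are illustrative.

```python
import numpy as np

X = np.array([[0, 1, 2],
              [2, 1, 0]])

# Position i holds the encoded value for genotype code i:
# AA(0)->1, Aa(1)->2, aa(2)->0.
lookup = np.array([1, 2, 0])

# Integer-array indexing applies the lookup to every element at once.
# Unlike the dictionary form, values above 2 raise IndexError and
# negative values wrap around, so the input must be validated first.
X_over = lookup[X]

print(X_over.tolist())  # -> [[1, 2, 0], [0, 2, 1]]
```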

Source code in tpot2/builtin_modules/genetic_encoders.py

class OverDominanceEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as an Over Dominance genetic model.
    The encoding used is AA(0)->1, Aa(1)->2, aa(2)->0. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Over Dominance encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 1, 1: 2, 2: 0}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

`fit(X, y=None)`

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

Name	Type	Description	Default
`X`	`array-like`		required

Source code in tpot2/builtin_modules/genetic_encoders.py

def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

`transform(X, y=None)`

Transform the data by applying the Over Dominance encoding.

Parameters:

Name	Type	Description	Default
`X`	`numpy ndarray, {n_samples, n_components}`	New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features).	required
`y`	`None`	Unused	`None`

Returns:

Name	Type	Description
`X_transformed`	`numpy ndarray, {n_samples, n_components}`	The encoded feature set

Source code in tpot2/builtin_modules/genetic_encoders.py

def transform(self, X, y=None):
    """Transform the data by applying the Over Dominance encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 1, 1: 2, 2: 0}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

`RecessiveEncoder`

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as a Recessive genetic model. The encoding used is AA(0)->0, Aa(1)->1, aa(2)->1.
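With this mapping, the encoded value is 1 for genotype codes 1 and 2 and 0 for code 0, so a column sum counts the individuals whose code is nonzero. The sketch below applies the mapping with plain NumPy and computes those per-variant counts; `recessive_map`, `X_recessive`, and `counts` are illustrative names.

```python
import numpy as np

# Four individuals (rows), three variants (columns).
X = np.array([[0, 1, 2],
              [0, 0, 1],
              [2, 1, 0],
              [1, 2, 2]])

# Mapping as documented for this class: AA(0)->0, Aa(1)->1, aa(2)->1.
recessive_map = {0: 0, 1: 1, 2: 1}
X_recessive = np.vectorize(lambda g: recessive_map.get(g, g))(X)

# Number of individuals encoded as 1 for each variant.
counts = X_recessive.sum(axis=0)

print(X_recessive.tolist())  # -> [[0, 1, 1], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
print(counts.tolist())       # -> [2, 3, 3]
```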

Source code in tpot2/builtin_modules/genetic_encoders.py

class RecessiveEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as a Recessive genetic model.
    The encoding used is AA(0)->0, Aa(1)->1, aa(2)->1. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Recessive encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 0, 1: 1, 2: 1}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

`fit(X, y=None)`

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

Name	Type	Description	Default
`X`	`array-like`		required

Source code in tpot2/builtin_modules/genetic_encoders.py

def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

`transform(X, y=None)`

Transform the data by applying the Recessive encoding.

Parameters:

Name	Type	Description	Default
`X`	`numpy ndarray, {n_samples, n_components}`	New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features).	required
`y`	`None`	Unused	`None`

Returns:

Name	Type	Description
`X_transformed`	`numpy ndarray, {n_samples, n_components}`	The encoded feature set

Source code in tpot2/builtin_modules/genetic_encoders.py

def transform(self, X, y=None):
    """Transform the data by applying the Recessive encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 0, 1: 1, 2: 1}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

`UnderDominanceEncoder`

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as an Under Dominance genetic model. The encoding used is AA(0)->2, Aa(1)->0, aa(2)->1.
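Unlike the two-level encodings, this mapping keeps three distinct levels and only reorders them, giving the heterozygote (Aa) the lowest value. The sketch below contrasts it with the Over Dominance mapping documented for `OverDominanceEncoder`, where the heterozygote receives the highest value; the variable names are illustrative.

```python
import numpy as np

# One column listing each genotype code once: AA, Aa, aa.
genotypes = np.array([0, 1, 2])

under_map = {0: 2, 1: 0, 2: 1}  # AA(0)->2, Aa(1)->0, aa(2)->1
over_map = {0: 1, 1: 2, 2: 0}   # AA(0)->1, Aa(1)->2, aa(2)->0

under = np.vectorize(lambda g: under_map.get(g, g))(genotypes)
over = np.vectorize(lambda g: over_map.get(g, g))(genotypes)

print(under.tolist())  # -> [2, 0, 1]
print(over.tolist())   # -> [1, 2, 0]

# Index 1 is the heterozygote: lowest under this model, highest under Over Dominance.
print(int(under.argmin()), int(over.argmax()))  # -> 1 1
```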

Source code in tpot2/builtin_modules/genetic_encoders.py

class UnderDominanceEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as an Under Dominance genetic model.
    The encoding used is AA(0)->2, Aa(1)->0, aa(2)->1. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Under Dominance encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 2, 1: 0, 2: 1}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

`fit(X, y=None)`

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

Name	Type	Description	Default
`X`	`array-like`		required

Source code in tpot2/builtin_modules/genetic_encoders.py

def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

`transform(X, y=None)`

Transform the data by applying the Under Dominance encoding.

Parameters:

Name	Type	Description	Default
`X`	`numpy ndarray, {n_samples, n_components}`	New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features).	required
`y`	`None`	Unused	`None`

Returns:

Name	Type	Description
`X_transformed`	`numpy ndarray, {n_samples, n_components}`	The encoded feature set

Source code in tpot2/builtin_modules/genetic_encoders.py

def transform(self, X, y=None):
    """Transform the data by applying the Under Dominance encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 2, 1: 0, 2: 1}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed
