
Genetic encoders

This file contains the class definitions for all the genetic encoders. Every encoder class inherits from the scikit-learn BaseEstimator and TransformerMixin classes so that it follows the scikit-learn estimator API.

DominantEncoder

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as a Dominant genetic model. The encoding used is AA(0)->1, Aa(1)->1, aa(2)->0.
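As a rough illustration of the Dominant mapping described above, the sketch below applies the same dictionary lookup to a small integer genotype matrix. The helper name `dominant_encode` and the sample data are assumptions made for this example, and the input validation done by the real class is left out.

```python
import numpy as np

# Dominant model as stated above: AA(0)->1, Aa(1)->1, aa(2)->0.
DOMINANT_MAP = {0: 1, 1: 1, 2: 0}


def dominant_encode(X):
    """Hypothetical helper that mirrors the dictionary lookup in the listing."""
    X = np.asarray(X)
    # Values that are not genotype codes (anything other than 0, 1, 2) pass through.
    lookup = np.vectorize(lambda g: DOMINANT_MAP[g] if g in DOMINANT_MAP else g)
    return lookup(X)


genotypes = np.array([[0, 1, 2],
                      [2, 1, 3]])  # 3 is not a genotype code
encoded = dominant_encode(genotypes)
print(encoded)
```

In this sketch the value 3 is returned unchanged, which matches the `if i in map else i` branch of the listing.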

Source code in tpot2/builtin_modules/genetic_encoders.py
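import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils import check_array


class DominantEncoder(BaseEstimator, TransformerMixin):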
    """This class contains the function definition for encoding the input features as a Dominant genetic model.
    The encoding used is AA(0)->1, Aa(1)->1, aa(2)->0. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Dominant encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 1, 1: 1, 2: 0}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

fit(X, y=None)

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | array-like | | required |

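Because `fit` learns nothing and returns the estimator itself, calls can be chained as `fit(X).transform(X)`. The sketch below shows that pattern with a stand-in class; `StatelessEncoder` is a hypothetical name used only here, and it deliberately avoids scikit-learn so that it runs with NumPy alone.

```python
import numpy as np


class StatelessEncoder:
    """Hypothetical stand-in for an encoder whose fit() learns nothing."""

    def fit(self, X, y=None):
        # Nothing is estimated from X; returning self allows call chaining.
        return self

    def transform(self, X, y=None):
        # Identity transform, used only to keep this sketch short.
        return np.asarray(X)


encoder = StatelessEncoder()
X = np.array([[0, 1], [2, 1]])

same_object = encoder.fit(X) is encoder   # fit() hands back the same instance
chained = encoder.fit(X).transform(X)     # so the two calls can be chained
```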
Source code in tpot2/builtin_modules/genetic_encoders.py
def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

transform(X, y=None)

Transform the data by applying the Dominant encoding.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
| y | None | Unused | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |

Source code in tpot2/builtin_modules/genetic_encoders.py
def transform(self, X, y=None):
    """Transform the data by applying the Dominant encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 1, 1: 1, 2: 0}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

HeterosisEncoder

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as a Heterozygote Advantage genetic model. The encoding used is AA(0)->0, Aa(1)->1, aa(2)->0.
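Since the Heterozygote Advantage mapping sends only code 1 to 1, it can also be written as a single comparison. The sketch below is one such alternative, not the library's implementation; `heterosis_encode` is a hypothetical name. Note one behavioural difference: values outside {0, 1, 2} become 0 here, whereas the listing leaves them unchanged.

```python
import numpy as np


def heterosis_encode(X):
    """Hypothetical helper: 1 where the genotype is heterozygous (code 1), else 0."""
    X = np.asarray(X)
    # The comparison yields booleans; casting gives the 0/1 encoding.
    return (X == 1).astype(int)


genotypes = np.array([[0, 1, 2],
                      [1, 1, 0]])
encoded = heterosis_encode(genotypes)
print(encoded)
```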

Source code in tpot2/builtin_modules/genetic_encoders.py
class HeterosisEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as a Heterozygote Advantage genetic model.
    The encoding used is AA(0)->0, Aa(1)->1, aa(2)->0. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Heterosis encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 0, 1: 1, 2: 0}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

fit(X, y=None)

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | array-like | | required |

Source code in tpot2/builtin_modules/genetic_encoders.py
def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

transform(X, y=None)

Transform the data by applying the Heterosis encoding.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
| y | None | Unused | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |

Source code in tpot2/builtin_modules/genetic_encoders.py
def transform(self, X, y=None):
    """Transform the data by applying the Heterosis encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 0, 1: 1, 2: 0}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

OverDominanceEncoder

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as an Over Dominance genetic model. The encoding used is AA(0)->1, Aa(1)->2, aa(2)->0.
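When every input is known to be an integer genotype code in {0, 1, 2}, the Over Dominance mapping can be sketched as a lookup-table index instead of a per-element function call. The names `OVER_DOMINANCE_LUT` and `over_dominance_encode` are assumptions for this example; unlike the listing, this variant rejects other values rather than passing them through.

```python
import numpy as np

# Position i holds the encoded value for genotype code i: 0->1, 1->2, 2->0.
OVER_DOMINANCE_LUT = np.array([1, 2, 0])


def over_dominance_encode(X):
    """Hypothetical helper using integer indexing into a lookup table."""
    X = np.asarray(X)
    if not np.issubdtype(X.dtype, np.integer):
        raise TypeError("lookup-table indexing needs integer genotype codes")
    if X.size and (X.min() < 0 or X.max() > 2):
        raise ValueError("genotype codes must be 0, 1 or 2")
    # Fancy indexing replaces every code with its table entry in one step.
    return OVER_DOMINANCE_LUT[X]


genotypes = np.array([[0, 1, 2],
                      [2, 2, 1]])
encoded = over_dominance_encode(genotypes)
print(encoded)
```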

Source code in tpot2/builtin_modules/genetic_encoders.py
class OverDominanceEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as an Over Dominance genetic model.
    The encoding used is AA(0)->1, Aa(1)->2, aa(2)->0. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Over Dominance encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 1, 1: 2, 2: 0}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

fit(X, y=None)

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | array-like | | required |

Source code in tpot2/builtin_modules/genetic_encoders.py
def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

transform(X, y=None)

Transform the data by applying the Over Dominance encoding.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
| y | None | Unused | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |

Source code in tpot2/builtin_modules/genetic_encoders.py
def transform(self, X, y=None):
    """Transform the data by applying the Over Dominance encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 1, 1: 2, 2: 0}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

RecessiveEncoder

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as a Recessive genetic model. The encoding used is AA(0)->0, Aa(1)->1, aa(2)->1.
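The Recessive mapping changes only one code (2 becomes 1), so a single `np.where` call is enough to express it. The sketch below is an illustrative alternative with the hypothetical name `recessive_encode`; like the listing, it leaves values outside {0, 1, 2} unchanged.

```python
import numpy as np


def recessive_encode(X):
    """Hypothetical helper: AA(0)->0, Aa(1)->1, aa(2)->1, other values untouched."""
    X = np.asarray(X)
    # Only code 2 changes; every other entry is copied from X as-is.
    return np.where(X == 2, 1, X)


genotypes = np.array([[0, 1, 2],
                      [2, 0, 3]])  # 3 is not a genotype code
encoded = recessive_encode(genotypes)
print(encoded)
```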

Source code in tpot2/builtin_modules/genetic_encoders.py
class RecessiveEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as a Recessive genetic model.
    The encoding used is AA(0)->0, Aa(1)->1, aa(2)->1. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Recessive encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 0, 1: 1, 2: 1}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

fit(X, y=None)

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | array-like | | required |

Source code in tpot2/builtin_modules/genetic_encoders.py
def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

transform(X, y=None)

Transform the data by applying the Recessive encoding.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
| y | None | Unused | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |

Source code in tpot2/builtin_modules/genetic_encoders.py
def transform(self, X, y=None):
    """Transform the data by applying the Recessive encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 0, 1: 1, 2: 1}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed

UnderDominanceEncoder

Bases: BaseEstimator, TransformerMixin

This class contains the function definition for encoding the input features as an Under Dominance genetic model. The encoding used is AA(0)->2, Aa(1)->0, aa(2)->1.
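The Under Dominance mapping is a permutation of the three codes, so replacing values one after another in place would overwrite entries that were already encoded. A minimal sketch of a mask-based version that avoids this is shown below; `UNDER_DOMINANCE_MAP` and `under_dominance_encode` are hypothetical names, and each mask is computed from the original array rather than from the partially updated output.

```python
import numpy as np

# Under Dominance model as stated above: AA(0)->2, Aa(1)->0, aa(2)->1.
UNDER_DOMINANCE_MAP = {0: 2, 1: 0, 2: 1}


def under_dominance_encode(X):
    """Hypothetical helper that permutes genotype codes without clobbering."""
    X = np.asarray(X)
    out = X.copy()
    for source, target in UNDER_DOMINANCE_MAP.items():
        # The mask comes from the untouched input X, never from out,
        # so a value written in an earlier iteration is not remapped again.
        out[X == source] = target
    return out


genotypes = np.array([[0, 1, 2],
                      [2, 0, 1]])
encoded = under_dominance_encode(genotypes)
print(encoded)
```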

Source code in tpot2/builtin_modules/genetic_encoders.py
class UnderDominanceEncoder(BaseEstimator, TransformerMixin):
    """This class contains the function definition for encoding the input features as an Under Dominance genetic model.
    The encoding used is AA(0)->2, Aa(1)->0, aa(2)->1. """

    def fit(self, X, y=None):
        """Do nothing and return the estimator unchanged.
        Dummy function to fit in with the sklearn API and hence work in pipelines.

        Parameters
        ----------
        X : array-like
        """
        return self

    def transform(self, X, y=None):
        """Transform the data by applying the Under Dominance encoding.

        Parameters
        ----------
        X : numpy ndarray, {n_samples, n_components}
            New data, where n_samples is the number of samples (number of individuals)
            and n_components is the number of components (number of features).
        y : None
            Unused

        Returns
        -------
        X_transformed: numpy ndarray, {n_samples, n_components}
            The encoded feature set
        """
        X = check_array(X)
        map = {0: 2, 1: 0, 2: 1}
        mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

        X_transformed = mapping_function(X)

        return X_transformed

fit(X, y=None)

Do nothing and return the estimator unchanged. Dummy function to fit in with the sklearn API and hence work in pipelines.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | array-like | | required |

Source code in tpot2/builtin_modules/genetic_encoders.py
def fit(self, X, y=None):
    """Do nothing and return the estimator unchanged.
    Dummy function to fit in with the sklearn API and hence work in pipelines.

    Parameters
    ----------
    X : array-like
    """
    return self

transform(X, y=None)

Transform the data by applying the Under Dominance encoding.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | numpy ndarray, {n_samples, n_components} | New data, where n_samples is the number of samples (number of individuals) and n_components is the number of components (number of features). | required |
| y | None | Unused | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| X_transformed | numpy ndarray, {n_samples, n_components} | The encoded feature set |

Source code in tpot2/builtin_modules/genetic_encoders.py
def transform(self, X, y=None):
    """Transform the data by applying the Under Dominance encoding.

    Parameters
    ----------
    X : numpy ndarray, {n_samples, n_components}
        New data, where n_samples is the number of samples (number of individuals)
        and n_components is the number of components (number of features).
    y : None
        Unused

    Returns
    -------
    X_transformed: numpy ndarray, {n_samples, n_components}
        The encoded feature set
    """
    X = check_array(X)
    map = {0: 2, 1: 0, 2: 1}
    mapping_function = np.vectorize(lambda i: map[i] if i in map else i)

    X_transformed = mapping_function(X)

    return X_transformed