Nsgaii

This file is part of the TPOT library.

The current version of TPOT was developed at Cedars-Sinai by: - Pedro Henrique Ribeiro (https://github.com/perib, https://www.linkedin.com/in/pedro-ribeiro/) - Anil Saini (anil.saini@cshs.org) - Jose Hernandez (jgh9094@gmail.com) - Jay Moran (jay.moran@cshs.org) - Nicholas Matsumoto (nicholas.matsumoto@cshs.org) - Hyunjun Choi (hyunjun.choi@cshs.org) - Miguel E. Hernandez (miguel.e.hernandez@cshs.org) - Jason Moore (moorejh28@gmail.com)

The original version of TPOT was primarily developed at the University of Pennsylvania by: - Randal S. Olson (rso@randalolson.com) - Weixuan Fu (weixuanf@upenn.edu) - Daniel Angell (dpa34@drexel.edu) - Jason Moore (moorejh28@gmail.com) - and many more generous open-source contributors

TPOT is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

TPOT is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with TPOT. If not, see http://www.gnu.org/licenses/.

`crowding_distance(matrix)` ¶

Takes a matrix of scores and returns the crowding distance for each point.

Parameters:

Name	Type	Description	Default
`matrix`	`ndarray`	The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.	required

Returns:

Type	Description
`list`	A list of the crowding distances for each point in the score matrix.

Source code in tpot2/selectors/nsgaii.py

def crowding_distance(matrix):
    """
    Takes a matrix of scores and returns the crowding distance for each point.

    Parameters
    ----------
    matrix : np.ndarray
        The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.

    Returns
    -------
    list
        A list of the crowding distances for each point in the score matrix.
    """
    matrix = np.array(matrix)
    # Initialize the crowding distance for each point to zero
    crowding_distances = [0 for _ in range(len(matrix))]

    # Iterate over each objective
    for objective_i in range(matrix.shape[1]):
        # Sort the points according to the current objective
        sorted_i = matrix[:, objective_i].argsort()

        # Set the crowding distance of the first and last points to infinity
        crowding_distances[sorted_i[0]] = float("inf")
        crowding_distances[sorted_i[-1]] = float("inf")

        if matrix[sorted_i[0]][objective_i] == matrix[sorted_i[-1]][objective_i]: # https://github.com/DEAP/deap/blob/f2a570567fa3dce156d7cfb0c50bc72f133258a1/deap/tools/emo.py#L135
            continue

        norm = matrix.shape[1] * float(matrix[sorted_i[0]][objective_i] - matrix[sorted_i[-1]][objective_i])
        for prev, cur, following in zip(sorted_i[:-2], sorted_i[1:-1], sorted_i[2:]):
            crowding_distances[cur] += (matrix[following][objective_i] - matrix[prev][objective_i]) / norm


    return crowding_distances

`dominates(list1, list2)` ¶

returns true is all values in list1 are not strictly worse than list2 AND at least one item in list1 is better than list2

Parameters:

Name	Type	Description	Default
`list1`	`list`	The first list of values to compare.	required
`list2`	`list`	The second list of values to compare.	required

Returns:

Type	Description
`bool`	True if all values in list1 are not strictly worse than list2 AND at least one item in list1 is better than list2, False otherwise.

Source code in tpot2/selectors/nsgaii.py

def dominates(list1, list2):
    """
    returns true is all values in list1 are not strictly worse than list2 AND at least one item in list1 is better than list2

    Parameters
    ----------
    list1 : list
        The first list of values to compare.
    list2 : list
        The second list of values to compare.

    Returns
    -------
    bool
        True if all values in list1 are not strictly worse than list2 AND at least one item in list1 is better than list2, False otherwise.

    """
    return all(list1[i] >= list2[i] for i in range(len(list1))) and any(list1[i] > list2[i] for i in range(len(list1)))

`nondominated_sorting(matrix)` ¶

Returns the indices of the non-dominated rows in the scores matrix. Rows are considered samples, and columns are considered objectives.

Parameters:

Name	Type	Description	Default
`matrix`	`ndarray`	The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.	required

Returns:

Type	Description
`list`	A list of lists of indices of the non-dominated rows in the scores matrix.

Source code in tpot2/selectors/nsgaii.py

def nondominated_sorting(matrix):
    """
    Returns the indices of the non-dominated rows in the scores matrix.
    Rows are considered samples, and columns are considered objectives.

    Parameters
    ----------
    matrix : np.ndarray
        The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.

    Returns
    -------
    list
        A list of lists of indices of the non-dominated rows in the scores matrix.

    """
    # Initialize the front list and the rank list

    # Initialize the current front
    fronts = {0:set()}

    # Initialize the list of dominated points
    dominated = [set() for _ in range(len(matrix))] #si the set of solutions which solution i dominates

    # Initialize the list of points that dominate the current point
    dominating = [0 for _ in range(len(matrix))] #ni the number of solutions that denominate solution i


    # Iterate over all points
    for p, p_scores in enumerate(matrix):
        # Iterate over all other points
        for q, q_scores in enumerate(matrix):
            # If the current point dominates the other point, increment the count of points dominated by the current point
            if dominates(p_scores, q_scores):
                dominated[p].add(q)
            # If the current point is dominated by the other point, add it to the list of dominated points
            elif dominates(q_scores, p_scores):
                dominating[p] += 1

        if dominating[p] == 0:
            fronts[0].add(p)

    i=0

    # Iterate until all points have been added to a front
    while len(fronts[i]) > 0:
        H = set()
        for p in fronts[i]:
            for q in dominated[p]:
                dominating[q] -= 1
                if dominating[q] == 0:
                    H.add(q)

        i += 1
        fronts[i] = H


    return [fronts[j] for j in range(i)]

`survival_select_NSGA2(scores, k, rng=None)` ¶

Select the top k individuals from the scores matrix using the NSGA-II algorithm.

Parameters:

Name	Type	Description	Default
`scores`	`ndarray`	The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.	required
`k`	`int`	The number of individuals to select.	required
`rng`	`(int, Generator)`	The random number generator. The default is None.	`None`

Returns:

Type	Description
`list`	A list of indices of the selected individuals (without repeats).

Source code in tpot2/selectors/nsgaii.py

def survival_select_NSGA2(scores, k, rng=None):
    """
    Select the top k individuals from the scores matrix using the NSGA-II algorithm.

    Parameters
    ----------
    scores : np.ndarray
        The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.
    k : int
        The number of individuals to select.
    rng : int, np.random.Generator, optional
        The random number generator. The default is None.

    Returns
    -------
    list
        A list of indices of the selected individuals (without repeats).

    """

    pareto_fronts = nondominated_sorting(scores)

    # chosen = list(itertools.chain.from_iterable(fronts))
    # if len(chosen) >= k:
    #     return chosen[0:k]

    chosen = []
    current_front_number = 0
    while len(chosen) < k and current_front_number < len(pareto_fronts):

        current_front = np.array(list(pareto_fronts[current_front_number]))
        front_scores = [scores[i] for i in current_front]
        crowding_distances = crowding_distance(front_scores)

        sorted_indeces = current_front[np.argsort(crowding_distances)[::-1]]

        chosen.extend(sorted_indeces[0:(k-len(chosen))])

        current_front_number += 1

    return chosen

Nsgaii

crowding_distance(matrix) ¶

dominates(list1, list2) ¶

nondominated_sorting(matrix) ¶

survival_select_NSGA2(scores, k, rng=None) ¶

`crowding_distance(matrix)` ¶

`dominates(list1, list2)` ¶

`nondominated_sorting(matrix)` ¶

`survival_select_NSGA2(scores, k, rng=None)` ¶