跳到内容

Nsgaii

此文件是 TPOT 库的一部分。

当前版本的 TPOT 由 Cedars-Sinai 的以下人员开发:- Pedro Henrique Ribeiro (https://github.com/perib, https://www.linkedin.com/in/pedro-ribeiro/) - Anil Saini (anil.saini@cshs.org) - Jose Hernandez (jgh9094@gmail.com) - Jay Moran (jay.moran@cshs.org) - Nicholas Matsumoto (nicholas.matsumoto@cshs.org) - Hyunjun Choi (hyunjun.choi@cshs.org) - Gabriel Ketron (gabriel.ketron@cshs.org) - Miguel E. Hernandez (miguel.e.hernandez@cshs.org) - Jason Moore (moorejh28@gmail.com)

TPOT 的原始版本主要由宾夕法尼亚大学的以下人员开发:- Randal S. Olson (rso@randalolson.com) - Weixuan Fu (weixuanf@upenn.edu) - Daniel Angell (dpa34@drexel.edu) - Jason Moore (moorejh28@gmail.com) - 以及更多慷慨的开源贡献者

TPOT 是自由软件:您可以根据自由软件基金会发布的 GNU 宽通用公共许可证(第三版或您选择的任何更高版本)的条款重新分发和/或修改它。

分发 TPOT 是希望它会有用,但没有任何担保;甚至不包括适销性或特定用途适用性的默示担保。详情请参阅 GNU 宽通用公共许可证。

您应该随 TPOT 一起收到了 GNU 宽通用公共许可证的副本。如果没有,请参见 https://gnu.ac.cn/licenses/

crowding_distance(matrix) 永久链接

接受一个得分矩阵并返回每个点的拥挤距离。

参数

名称 类型 描述 默认值
matrix ndarray

得分矩阵,其中行代表个体,列对应不同目标上的得分。

必需

返回值

类型 描述
list

得分矩阵中每个点的拥挤距离列表。

源代码位于 tpot/selectors/nsgaii.py
def crowding_distance(matrix):
    """
    Takes a matrix of scores and returns the crowding distance for each point.

    Parameters
    ----------
    matrix : np.ndarray
        The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.

    Returns
    -------
    list
        A list of the crowding distances for each point in the score matrix.
    """
    matrix = np.array(matrix)
    # Initialize the crowding distance for each point to zero
    crowding_distances = [0 for _ in range(len(matrix))]

    # Iterate over each objective
    for objective_i in range(matrix.shape[1]):
        # Sort the points according to the current objective
        sorted_i = matrix[:, objective_i].argsort()

        # Set the crowding distance of the first and last points to infinity
        crowding_distances[sorted_i[0]] = float("inf")
        crowding_distances[sorted_i[-1]] = float("inf")

        if matrix[sorted_i[0]][objective_i] == matrix[sorted_i[-1]][objective_i]: # https://github.com/DEAP/deap/blob/f2a570567fa3dce156d7cfb0c50bc72f133258a1/deap/tools/emo.py#L135
            continue

        norm = matrix.shape[1] * float(matrix[sorted_i[0]][objective_i] - matrix[sorted_i[-1]][objective_i])
        for prev, cur, following in zip(sorted_i[:-2], sorted_i[1:-1], sorted_i[2:]):
            crowding_distances[cur] += (matrix[following][objective_i] - matrix[prev][objective_i]) / norm


    return crowding_distances

dominates(list1, list2) 永久链接

如果 list1 中的所有值都不严格劣于 list2,并且 list1 中至少有一个值优于 list2,则返回 True

参数

名称 类型 描述 默认值
list1 list

要比较的第一个值列表。

必需
list2 list

要比较的第二个值列表。

必需

返回值

类型 描述
bool

如果 list1 中的所有值都不严格劣于 list2,并且 list1 中至少有一个值优于 list2,则返回 True,否则返回 False。

源代码位于 tpot/selectors/nsgaii.py
def dominates(list1, list2):
    """
    returns true is all values in list1 are not strictly worse than list2 AND at least one item in list1 is better than list2

    Parameters
    ----------
    list1 : list
        The first list of values to compare.
    list2 : list
        The second list of values to compare.

    Returns
    -------
    bool
        True if all values in list1 are not strictly worse than list2 AND at least one item in list1 is better than list2, False otherwise.

    """
    return all(list1[i] >= list2[i] for i in range(len(list1))) and any(list1[i] > list2[i] for i in range(len(list1)))

nondominated_sorting(matrix) 永久链接

返回得分矩阵中非支配行的索引。行被视为样本,列被视为目标。

参数

名称 类型 描述 默认值
matrix ndarray

得分矩阵,其中行代表个体,列对应不同目标上的得分。

必需

返回值

类型 描述
list

得分矩阵中非支配行的索引列表的列表。

源代码位于 tpot/selectors/nsgaii.py
def nondominated_sorting(matrix):
    """
    Returns the indices of the non-dominated rows in the scores matrix.
    Rows are considered samples, and columns are considered objectives.

    Parameters
    ----------
    matrix : np.ndarray
        The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.

    Returns
    -------
    list
        A list of lists of indices of the non-dominated rows in the scores matrix.

    """
    # Initialize the front list and the rank list

    # Initialize the current front
    fronts = {0:set()}

    # Initialize the list of dominated points
    dominated = [set() for _ in range(len(matrix))] #si the set of solutions which solution i dominates

    # Initialize the list of points that dominate the current point
    dominating = [0 for _ in range(len(matrix))] #ni the number of solutions that denominate solution i


    # Iterate over all points
    for p, p_scores in enumerate(matrix):
        # Iterate over all other points
        for q, q_scores in enumerate(matrix):
            # If the current point dominates the other point, increment the count of points dominated by the current point
            if dominates(p_scores, q_scores):
                dominated[p].add(q)
            # If the current point is dominated by the other point, add it to the list of dominated points
            elif dominates(q_scores, p_scores):
                dominating[p] += 1

        if dominating[p] == 0:
            fronts[0].add(p)

    i=0

    # Iterate until all points have been added to a front
    while len(fronts[i]) > 0:
        H = set()
        for p in fronts[i]:
            for q in dominated[p]:
                dominating[q] -= 1
                if dominating[q] == 0:
                    H.add(q)

        i += 1
        fronts[i] = H


    return [fronts[j] for j in range(i)]

survival_select_NSGA2(scores, k, rng=None) 永久链接

使用 NSGA-II 算法从得分矩阵中选择排名前 k 的个体。

参数

名称 类型 描述 默认值
scores ndarray

得分矩阵,其中行代表个体,列对应不同目标上的得分。

必需
k int

要选择的个体数量。

必需
rng (int, Generator)

随机数生成器。默认值为 None。

None

返回值

类型 描述
list

选定个体的索引列表(无重复)。

源代码位于 tpot/selectors/nsgaii.py
def survival_select_NSGA2(scores, k, rng=None):
    """
    Select the top k individuals from the scores matrix using the NSGA-II algorithm.

    Parameters
    ----------
    scores : np.ndarray
        The score matrix, where rows the individuals and the columns are the corresponds to scores on different objectives.
    k : int
        The number of individuals to select.
    rng : int, np.random.Generator, optional
        The random number generator. The default is None.

    Returns
    -------
    list
        A list of indices of the selected individuals (without repeats).

    """

    pareto_fronts = nondominated_sorting(scores)

    # chosen = list(itertools.chain.from_iterable(fronts))
    # if len(chosen) >= k:
    #     return chosen[0:k]

    chosen = []
    current_front_number = 0
    while len(chosen) < k and current_front_number < len(pareto_fronts):

        current_front = np.array(list(pareto_fronts[current_front_number]))
        front_scores = [scores[i] for i in current_front]
        crowding_distances = crowding_distance(front_scores)

        sorted_indeces = current_front[np.argsort(crowding_distances)[::-1]]

        chosen.extend(sorted_indeces[0:(k-len(chosen))])

        current_front_number += 1

    return chosen