Cross-Entropy Method (CEM) Optimization

liang @ 2024-04-07

The Cross-Entropy Method (CEM) is an optimization algorithm commonly used for continuous optimization problems. Its basic idea is to iteratively draw random samples, evaluate them on the objective function, and use the best-performing samples to update the parameters of the sampling distribution, gradually concentrating it around the optimum. The following concrete example shows how to use CEM to solve a simple continuous optimization problem.

Example

Suppose we want to minimize the function f(x) = x^2 + 2x + 1, where x is a real number. We want to find the value of x at which f(x) attains its minimum. (Since f(x) = (x + 1)^2, the answer should be x = -1, which lets us verify the result.)

import numpy as np

# Define the objective function
def f(x):
    return x**2 + 2*x + 1

# CEM hyperparameters
num_samples = 100   # number of samples drawn per iteration
elite_frac = 0.2    # fraction of samples kept as elites each iteration (top 20%)

# Initial parameters of the sampling distribution
mean = 0.0
std = 1.0

# Run the CEM iterations. In each iteration we draw samples, evaluate
# them on the objective function, select the best-performing fraction,
# and use those elites to update the mean and standard deviation of
# the sampling distribution.
num_iterations = 10 # number of iterations

for i in range(num_iterations):
    # Draw samples from the current distribution
    samples = np.random.normal(mean, std, num_samples)
    
    # Evaluate the samples on the objective function
    scores = f(samples)

    # Select the best-performing (elite) samples
    elite_size = int(num_samples * elite_frac)
    elite_indices = np.argsort(scores)[:elite_size]
    elite_samples = samples[elite_indices]

    # Update the sampling distribution from the elites
    mean = np.mean(elite_samples)
    std = np.std(elite_samples)

    # Print the current distribution parameters
    print(f"Iteration {i+1}: Mean = {mean:.4f}, Std = {std:.4f}")

    # Report the best sample of this iteration (index 0 has the lowest score)
    best_sample = elite_samples[0]
    best_score = f(best_sample)

    print(f"Best sample after iteration {i+1}: x = {best_sample:.4f}, f(x) = {best_score:.4f}")


Example output (the exact numbers vary between runs, since no random seed is fixed):

Iteration 1: Mean = -1.0185, Std = 0.1761
Best sample after iteration 1: x = -1.0124, f(x) = 0.0002
Iteration 2: Mean = -0.9993, Std = 0.0225
Best sample after iteration 2: x = -1.0004, f(x) = 0.0000
Iteration 3: Mean = -0.9999, Std = 0.0044
Best sample after iteration 3: x = -0.9999, f(x) = 0.0000
Iteration 4: Mean = -1.0000, Std = 0.0011
Best sample after iteration 4: x = -1.0000, f(x) = 0.0000
Iteration 5: Mean = -1.0000, Std = 0.0001
Best sample after iteration 5: x = -1.0000, f(x) = 0.0000
Iteration 6: Mean = -1.0000, Std = 0.0000
Best sample after iteration 6: x = -1.0000, f(x) = 0.0000
Iteration 7: Mean = -1.0000, Std = 0.0000
Best sample after iteration 7: x = -1.0000, f(x) = 0.0000
Iteration 8: Mean = -1.0000, Std = 0.0000
Best sample after iteration 8: x = -1.0000, f(x) = 0.0000
Iteration 9: Mean = -1.0000, Std = 0.0000
Best sample after iteration 9: x = -1.0000, f(x) = 0.0000
Iteration 10: Mean = -1.0000, Std = 0.0000
Best sample after iteration 10: x = -1.0000, f(x) = 0.0000

That is, f(x) = x^2 + 2x + 1 attains its minimum value 0.0000 at x = -1.0000, matching the analytic answer x = -1 from f(x) = (x + 1)^2.
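The same procedure extends directly to multiple dimensions, and in practice the updates are often smoothed with a learning rate so the standard deviation does not collapse too quickly. A minimal sketch (the 2-D test function g, the fixed seed, and all hyperparameter values here are illustrative assumptions, not part of the example above):

```python
import numpy as np

# Hypothetical 2-D quadratic with its minimum at (1, -2)
def g(x):
    return (x[:, 0] - 1) ** 2 + (x[:, 1] + 2) ** 2

rng = np.random.default_rng(0)  # fixed seed for reproducibility
mean = np.zeros(2)              # per-dimension mean of the sampling distribution
std = np.ones(2)                # per-dimension standard deviation
num_samples, elite_frac, lr = 200, 0.2, 0.7

for _ in range(30):
    # Draw a batch of 2-D samples and keep the lowest-scoring fraction
    samples = rng.normal(mean, std, size=(num_samples, 2))
    elite = samples[np.argsort(g(samples))[: int(num_samples * elite_frac)]]
    # Smoothed update: blend the old parameters with the elite statistics
    mean = (1 - lr) * mean + lr * elite.mean(axis=0)
    std = (1 - lr) * std + lr * elite.std(axis=0)

print(f"mean = {mean}")  # should be close to (1, -2)
```

Smoothing (lr < 1) trades some convergence speed for robustness; with lr = 1 this reduces to the plain update used in the example above.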