• Parameter Optimization
    • Exhaustive Algorithm
    • Genetic Algorithm

Parameter Optimization

vnpy provides two solutions for parameter optimization: the exhaustive algorithm and the genetic algorithm.

Exhaustive Algorithm

How the exhaustive algorithm works:

• Specify the names of the parameters to optimize, their optimization ranges and step sizes, and the optimization target; a usage sketch follows the code below.
```python
def add_parameter(
    self, name: str, start: float, end: float = None, step: float = None
):
    """"""
    # Single value: the parameter is fixed rather than optimized
    if not end and not step:
        self.params[name] = [start]
        return

    if start >= end:
        print("参数优化起始点必须小于终止点")  # start must be less than end
        return

    if step <= 0:
        print("参数优化步进必须大于0")  # step must be greater than 0
        return

    # Collect every candidate value from start to end (inclusive) at the given step
    value = start
    value_list = []

    while value <= end:
        value_list.append(value)
        value += step

    self.params[name] = value_list

def set_target(self, target_name: str):
    """"""
    self.target_name = target_name
```
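For a concrete feel of these two methods, here is a minimal standalone sketch. The class body reproduces only the core logic from the excerpt above, and the parameter names `atr_window`, `atr_ma_window`, and the target name `sharpe_ratio` are made up for illustration:

```python
class OptimizationSetting:
    """Minimal sketch of the class the two methods above belong to."""

    def __init__(self):
        self.params = {}
        self.target_name = ""

    def add_parameter(self, name, start, end=None, step=None):
        if not end and not step:
            self.params[name] = [start]     # fixed parameter, single candidate
            return
        value, value_list = start, []
        while value <= end:                 # inclusive range at the given step
            value_list.append(value)
            value += step
        self.params[name] = value_list

    def set_target(self, target_name):
        self.target_name = target_name


setting = OptimizationSetting()
setting.add_parameter("atr_window", 10, 30, 5)   # optimize over 10, 15, 20, 25, 30
setting.add_parameter("atr_ma_window", 10)       # keep fixed at 10
setting.set_target("sharpe_ratio")

print(setting.params)       # {'atr_window': [10, 15, 20, 25, 30], 'atr_ma_window': [10]}
print(setting.target_name)  # sharpe_ratio
```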
• Generate the full set of parameter combinations; the data structure is a list of dicts: [{key: value, key: value}, {key: value, key: value}]. A small standalone demo follows the code below.
```python
def generate_setting(self):
    """"""
    keys = self.params.keys()
    values = self.params.values()
    # Cartesian product over all candidate value lists (itertools.product)
    products = list(product(*values))

    settings = []
    for p in products:
        setting = dict(zip(keys, p))
        settings.append(setting)

    return settings
```
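The combination logic is just a Cartesian product over the candidate value lists. A standalone sketch with two hypothetical parameters:

```python
from itertools import product

# Two hypothetical parameters with 2 and 3 candidate values each
params = {
    "fast_window": [10, 20],
    "slow_window": [30, 40, 50],
}

# Same logic as generate_setting(): one dict per combination, 2 x 3 = 6 in total
settings = [dict(zip(params.keys(), p)) for p in product(*params.values())]

for s in settings:
    print(s)
# {'fast_window': 10, 'slow_window': 30}
# {'fast_window': 10, 'slow_window': 40}
# ...
# {'fast_window': 20, 'slow_window': 50}
```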
• Iterate over every parameter combination in the set: each iteration runs one strategy backtest and returns the optimization target value; the results are then sorted by target value and output. (A hedged sketch of the optimize worker function follows the code below.)
```python
def run_optimization(self, optimization_setting: OptimizationSetting, output=True):
    """"""
    # Get optimization setting and target
    settings = optimization_setting.generate_setting()
    target_name = optimization_setting.target_name

    if not settings:
        self.output("优化参数组合为空,请检查")  # parameter combinations are empty
        return

    if not target_name:
        self.output("优化目标未设置,请检查")  # optimization target is not set
        return

    # Use multiprocessing pool for running backtesting with different setting
    pool = multiprocessing.Pool(multiprocessing.cpu_count())

    results = []
    for setting in settings:
        result = pool.apply_async(optimize, (
            target_name,
            self.strategy_class,
            setting,
            self.vt_symbol,
            self.interval,
            self.start,
            self.rate,
            self.slippage,
            self.size,
            self.pricetick,
            self.capital,
            self.end,
            self.mode
        ))
        results.append(result)

    pool.close()
    pool.join()

    # Sort results by target value (descending) and output
    result_values = [result.get() for result in results]
    result_values.sort(reverse=True, key=lambda result: result[1])

    if output:
        for value in result_values:
            msg = f"参数:{value[0]}, 目标:{value[1]}"  # parameters / target
            self.output(msg)

    return result_values
```
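The `optimize` function handed to `apply_async` lives at module level so that it can be pickled into the worker processes. Its body is not shown in this section; judging from the argument list above, its structure is roughly the sketch below. The `BacktestingEngine` method calls are an assumption and may differ between vnpy versions:

```python
def optimize(
    target_name: str,
    strategy_class: type,
    setting: dict,
    vt_symbol: str,
    interval, start, rate, slippage,
    size, pricetick, capital, end, mode
):
    """Run one backtest in a worker process (sketch, not verbatim vnpy code)."""
    engine = BacktestingEngine()
    engine.set_parameters(
        vt_symbol=vt_symbol, interval=interval, start=start,
        rate=rate, slippage=slippage, size=size, pricetick=pricetick,
        capital=capital, end=end, mode=mode
    )
    engine.add_strategy(strategy_class, setting)
    engine.load_data()
    engine.run_backtesting()
    engine.calculate_result()
    statistics = engine.calculate_statistics(output=False)

    # Index 1 of this tuple is what run_optimization() sorts on,
    # and indices 0/1 are what the log message prints
    target_value = statistics[target_name]
    return (str(setting), target_value, statistics)
```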

Note: the multiprocessing library is used to create multiple processes for parallel optimization. For example, on a 2-core machine the optimization takes roughly half the single-process time; on a 10-core machine, roughly one tenth.

Using the exhaustive algorithm:

• Click the "Parameter Optimization" button to open the "Optimization Parameter Configuration" window, where you set the optimization target (e.g. maximizing the Sharpe ratio or the return-to-drawdown ratio) and the parameters to optimize together with their ranges, as shown: https://vnpy-community.oss-cn-shanghai.aliyuncs.com/forum_experience/yazhang/cta_backtester/optimize_setting.png

• After configuring the parameters to optimize, click the "Confirm" button at the bottom of the "Optimization Parameter Configuration" window to start multi-process parallel optimization across the CPU cores; the log outputs the progress. https://vnpy-community.oss-cn-shanghai.aliyuncs.com/forum_experience/yazhang/cta_backtester/optimize_log.png

• Click the "Optimization Result" button to view the results; the parameter combinations in the figure are sorted by the target value (Sharpe ratio) in descending order. https://vnpy-community.oss-cn-shanghai.aliyuncs.com/forum_experience/yazhang/cta_backtester/optimize_result.png

Genetic Algorithm

How the genetic algorithm works:

• Specify the names of the parameters to optimize, their optimization ranges and step sizes, and the optimization target;
• Generate the full set of parameter combinations. The data structure is a list of lists of tuples, i.e. [[(key, value), (key, value)], [(key, value), (key, value)]], which differs from the exhaustive algorithm's list of dicts. This layout makes crossover and mutation between parameters straightforward (see the comparison sketch after the code below).
```python
def generate_setting_ga(self):
    """"""
    settings_ga = []
    settings = self.generate_setting()
    for d in settings:
        param = [tuple(i) for i in d.items()]
        settings_ga.append(param)
    return settings_ga
```
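The difference between the two data structures is easiest to see side by side; a standalone sketch with made-up parameter names:

```python
# Exhaustive algorithm: a list of dicts
settings = [
    {"fast_window": 10, "slow_window": 30},
    {"fast_window": 10, "slow_window": 40},
]

# GA form: a list of lists of (key, value) tuples,
# exactly what generate_setting_ga() produces
settings_ga = [[tuple(i) for i in d.items()] for d in settings]
print(settings_ga)
# [[('fast_window', 10), ('slow_window', 30)],
#  [('fast_window', 10), ('slow_window', 40)]]
```

Because each individual is a plain list, deap's list-based operators such as cxTwoPoint can cut and splice it position by position, and dict(individual) recovers the setting whenever a backtest is needed.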
• Generate an individual: call random.choice() to draw one parameter combination at random from the full set.
```python
def generate_parameter():
    """"""
    return random.choice(settings)
```
• Define the mutation rule for individuals: when mutation occurs, the old individual is completely replaced by a new one (demonstrated in the sketch after the code below).
```python
def mutate_individual(individual, indpb):
    """"""
    size = len(individual)
    paramlist = generate_parameter()
    for i in range(size):
        if random.random() < indpb:
            individual[i] = paramlist[i]
    return individual,
```
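A quick standalone demonstration of the rule (parameter names and values are made up): since the toolbox later registers mutate with indpb=1, every position is overwritten, so mutation effectively replaces the whole individual with a freshly drawn one.

```python
import random

# Hypothetical global parameter combinations in GA form
settings = [
    [("fast_window", 10), ("slow_window", 30)],
    [("fast_window", 20), ("slow_window", 40)],
    [("fast_window", 20), ("slow_window", 50)],
]

def generate_parameter():
    return random.choice(settings)

def mutate_individual(individual, indpb):
    size = len(individual)
    paramlist = generate_parameter()
    for i in range(size):
        if random.random() < indpb:     # with indpb=1 this always fires
            individual[i] = paramlist[i]
    return individual,

individual = [("fast_window", 10), ("slow_window", 30)]
print(mutate_individual(individual, indpb=1))  # a randomly drawn replacement
```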
• Define the evaluation function: its input is an individual, i.e. a parameter combination of the form [(key, value), (key, value)]; it is converted into a setting dict via dict(), the backtest is run, and the optimization target value (e.g. Sharpe ratio or return-to-drawdown ratio) is returned. (Note: the @lru_cache decorator caches results so that identical inputs are never recomputed, which greatly reduces the genetic algorithm's running time; a toy demonstration follows the code below.)

```python
@lru_cache(maxsize=1000000)
def _ga_optimize(parameter_values: tuple):
    """"""
    setting = dict(parameter_values)

    result = optimize(
        ga_target_name,
        ga_strategy_class,
        setting,
        ga_vt_symbol,
        ga_interval,
        ga_start,
        ga_rate,
        ga_slippage,
        ga_size,
        ga_pricetick,
        ga_capital,
        ga_end,
        ga_mode
    )
    return (result[1],)


def ga_optimize(parameter_values: list):
    """"""
    # lru_cache needs hashable arguments, so the list is converted to a tuple
    return _ga_optimize(tuple(parameter_values))
```
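The tuple conversion in ga_optimize is what makes the cache usable: lru_cache hashes its arguments, and a list is unhashable while a tuple is. A toy sketch of the same caching pattern, with the backtest replaced by a counter:

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=1000000)
def _evaluate(parameter_values: tuple):
    global calls
    calls += 1                  # stands in for an expensive backtest
    return (sum(v for _, v in parameter_values),)

def evaluate(parameter_values: list):
    # Lists are unhashable, so convert to a tuple before hitting the cache
    return _evaluate(tuple(parameter_values))

individual = [("fast_window", 10), ("slow_window", 30)]
evaluate(individual)
evaluate(individual)            # cache hit: _evaluate body is not re-run
print(calls)                    # 1
```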

• Run the genetic algorithm: call the deap library's algorithm engine. The workflow is as follows. 1) Define the optimization direction, e.g. maximizing the Sharpe ratio. 2) Randomly draw individuals from the full set of parameter combinations to form a population. 3) Evaluate every individual in the population (i.e. run a backtest) and eliminate the poorly performing ones. 4) The remaining individuals undergo crossover or mutation and, after evaluation and selection, form a new population (this completes one full generation). 5) After many generations, diversity within the population decreases while overall fitness improves, and the final output is the recommended result: a Pareto set containing one or more parameter combinations. Note: thanks to @lru_cache, the middle and late generations run much faster, because repeated inputs skip the backtest entirely and the cached result is returned straight from memory. (A self-contained toy run follows the code below.)
```python
from deap import creator, base, tools, algorithms

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)
......

# Set up genetic algorithm
toolbox = base.Toolbox()
toolbox.register("individual", tools.initIterate, creator.Individual, generate_parameter)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", mutate_individual, indpb=1)
toolbox.register("evaluate", ga_optimize)
toolbox.register("select", tools.selNSGA2)

total_size = len(settings)
pop_size = population_size          # number of individuals in each generation
lambda_ = pop_size                  # number of children to produce at each generation
mu = int(pop_size * 0.8)            # number of individuals to select for the next generation

cxpb = 0.95                         # probability that an offspring is produced by crossover
mutpb = 1 - cxpb                    # probability that an offspring is produced by mutation
ngen = ngen_size                    # number of generations

pop = toolbox.population(pop_size)
hof = tools.ParetoFront()           # end result of pareto front

stats = tools.Statistics(lambda ind: ind.fitness.values)
np.set_printoptions(suppress=True)
stats.register("mean", np.mean, axis=0)
stats.register("std", np.std, axis=0)
stats.register("min", np.min, axis=0)
stats.register("max", np.max, axis=0)

algorithms.eaMuPlusLambda(
    pop,
    toolbox,
    mu,
    lambda_,
    cxpb,
    mutpb,
    ngen,
    stats,
    halloffame=hof
)

# Return result list
results = []

for parameter_values in hof:
    setting = dict(parameter_values)
    target_value = ga_optimize(parameter_values)[0]
    results.append((setting, target_value, {}))

return results
```
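To see the whole pipeline run end to end, the toy script below keeps the exact toolbox setup from the excerpt but replaces the backtest with a trivial evaluation function whose optimum is known. It requires deap; all parameter names and values are made up:

```python
import random
from itertools import product

from deap import algorithms, base, creator, tools

# Toy "global parameter combinations" in GA form
params = {"fast_window": [5, 10, 15, 20], "slow_window": [20, 30, 40, 50]}
settings = [list(zip(params.keys(), p)) for p in product(*params.values())]

def generate_parameter():
    return random.choice(settings)

def mutate_individual(individual, indpb):
    paramlist = generate_parameter()
    for i in range(len(individual)):
        if random.random() < indpb:
            individual[i] = paramlist[i]
    return individual,

def evaluate(individual):
    # Stand-in for the backtest: the "target" peaks at
    # fast_window=10, slow_window=40
    setting = dict(individual)
    return (-abs(setting["fast_window"] - 10) - abs(setting["slow_window"] - 40),)

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("individual", tools.initIterate, creator.Individual, generate_parameter)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", mutate_individual, indpb=1)
toolbox.register("evaluate", evaluate)
toolbox.register("select", tools.selNSGA2)

pop = toolbox.population(16)
hof = tools.ParetoFront()

algorithms.eaMuPlusLambda(
    pop, toolbox, mu=12, lambda_=16,
    cxpb=0.95, mutpb=0.05, ngen=10,
    halloffame=hof, verbose=False
)

for individual in hof:
    print(dict(individual), individual.fitness.values)
# Expected to converge on {'fast_window': 10, 'slow_window': 40}
```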