Intelligent Vehicle Path Planning Based on Neural Network Q-learning Algorithm

Received: 2018-02-16; Revised: 2018-04-11

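Foundation item: National Natural Science Foundation of China (11372122)

About the author: WEI Yu-liang (1994-), male, born in Hefei, Anhui; master's student. Research interests: artificial intelligence and robotics.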

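Corresponding author: JIN Wu-yin (1969-), male, born in Qin'an, Gansu; research fellow and doctoral supervisor.

Abstract: To address global path planning and obstacle avoidance for an intelligent car in motion, a reinforcement learning algorithm based on neural network Q-learning is proposed. An RBF (Radial Basis Function) network approximates the action-value function of the Q-learning algorithm, and a simulation system for global path planning and obstacle avoidance of the intelligent car is developed in MATLAB. Compared with the traditional Q-learning algorithm and the potential-field-based Q-learning algorithm, the adopted algorithm completes path planning and obstacle avoidance in the driving environment more effectively. Simulation results show that the algorithm has a better convergence speed and can enhance the self-navigation capability of the intelligent car.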

Keywords: path planning, intelligent car, Q-learning, neural network, simulation

CLC number: TP242

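Document code: A

DOI: 10.3969/j.issn.1002-0640.2019.02.010

Citation format: WEI Yu-liang, JIN Wu-yin. Intelligent vehicle path planning based on neural network Q-learning algorithm [J]. Fire Control & Command Control, 2019, 44(2): 46-49.

Intelligent Vehicle Path Planning Based on Neural Network Q-learning Algorithm*

WEI Yu-liang, JIN Wu-yin*

(School of Mechno-Electronic Engineering, Lanzhou University of Technology, Lanzhou 730050)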

Intelligent Vehicle Path Planning Based on Neural Network Q-learning Algorithm

WEI Yu-liang, JIN Wu-yin*

(School of Mechno-Electronic Engineering, Lanzhou University of Technology, Lanzhou 730050, China)

Abstract: A reinforcement learning algorithm based on neural network Q-learning is proposed to solve the problem of global path planning and obstacle avoidance. An RBF (Radial Basis Function) network is used to approximate the action-value function of the Q-learning algorithm. A simulation system for global path planning and obstacle avoidance is developed in MATLAB. Compared with the traditional Q-learning algorithm and the potential-field-based Q-learning algorithm, the proposed algorithm completes path planning and obstacle avoidance for an intelligent car in the driving environment more effectively. The simulation results show that the algorithm has a better convergence speed and enhances the self-navigation ability of the intelligent car.

Key words: path planning, intelligent car, Q-learning, neural network, simulation

Citation format: WEI Y L, JIN W Y. Intelligent vehicle path planning based on neural network Q-learning algorithm [J]. Fire Control & Command Control, 2019, 44(2): 46-49.

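Article ID: 1002-0640(2019)02-0046-04

Fire Control & Command Control, Vol. 44, No. 2, Feb. 2019

0 Introduction

Machine learning falls into three categories: supervised learning, unsupervised learning, and reinforcement learning. Reinforcement learning is a machine learning approach whose learning strategy is driven by feedback from the environment [1-2]. Monte Carlo methods, Q-learning, simulated annealing, and genetic algorithms are all classed as reinforcement learning in [3]. The Q-learning algorithm proposed by Watkins is one of the more widely applied reinforcement learning algorithms, and its distinguishing feature is that it does not depend on a prior model of the environment [4-5]. Q-learning is therefore a model-free online learning algorithm [6]. This paper uses Q-learning to solve path planning and obstacle avoidance for an intelligent car as it travels, in particular when the environment contains obstacles in addition to the start and target positions. Because the quantization step of the Q-learning algorithm affects the final experimental results, an RBF network is used to approximate the action-value function of the Q-learning algorithm.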
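The two ideas named in the introduction, a model-free Q-learning update and an RBF approximation of the action-value function, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the authors' MATLAB system: the 5x5 grid, the obstacle cells, the reward values, the hyperparameters, the choice of one fixed Gaussian centre per cell, and the helper names `features`, `q_value`, `step`, `greedy_action`, `train`, and `greedy_path`. Only the linear output weights of the RBF layer are learned here; centres and widths stay fixed.

```python
import math
import random

# Hypothetical 5x5 grid world; layout, rewards and hyperparameters are assumptions.
SIZE = 5
START, GOAL = (0, 0), (4, 4)
OBSTACLES = {(1, 1), (2, 3), (3, 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# Assumed RBF layer: one fixed Gaussian centre per grid cell, shared width SIGMA.
CENTERS = [(r, c) for r in range(SIZE) for c in range(SIZE)]
SIGMA = 0.5


def features(state):
    """Return the Gaussian RBF activations for a grid cell (row, col)."""
    return [math.exp(-((state[0] - r) ** 2 + (state[1] - c) ** 2) / (2 * SIGMA ** 2))
            for r, c in CENTERS]


def q_value(weights, phi, action):
    """Q(s, a) as a weighted sum of RBF activations (linear output layer)."""
    return sum(w * f for w, f in zip(weights[action], phi))


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    if not inside or nxt in OBSTACLES:
        return state, -1.0, False      # blocked move: stay in place, penalty
    if nxt == GOAL:
        return nxt, 10.0, True         # reaching the target ends the episode
    return nxt, -0.1, False            # small cost per ordinary move


def greedy_action(weights, phi):
    """Index of the action with the largest approximated Q-value."""
    return max(range(len(ACTIONS)), key=lambda a: q_value(weights, phi, a))


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Q-learning with epsilon-greedy exploration; only output weights are learned."""
    rng = random.Random(seed)
    weights = [[0.0] * len(CENTERS) for _ in ACTIONS]
    for _ in range(episodes):
        state = START
        for _ in range(200):           # cap on steps per episode
            phi = features(state)
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = greedy_action(weights, phi)
            nxt, reward, done = step(state, action)
            target = reward
            if not done:
                phi_next = features(nxt)
                target += gamma * max(q_value(weights, phi_next, a)
                                      for a in range(len(ACTIONS)))
            td_error = target - q_value(weights, phi, action)
            for k, f in enumerate(phi):
                weights[action][k] += alpha * td_error * f
            state = nxt
            if done:
                break
    return weights


def greedy_path(weights, max_steps=50):
    """Follow the learned greedy policy from START; stop at GOAL or after max_steps."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, greedy_action(weights, features(state)))
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the list of cells visited by the learned policy. The update uses only sampled transitions and rewards, never a transition model, which is what "model-free" means in the text above. Replacing a lookup table with RBF activations lets neighbouring cells share value estimates, and a narrow `SIGMA` keeps that sharing small.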
