多UAV协同任务资源分配与编队轨迹优化方法研究

Research on Resources Allocation and Formation Trajectories Optimization for Multiple UAVs Cooperation Mission

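【Author】 李远 (Li Yuan)

【Advisor】 沈林成 (Shen Lincheng)

【Author information】 National University of Defense Technology, Control Science and Engineering, 2011, PhD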

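【Abstract (translated from the Chinese)】 Cooperative mission planning for multiple unmanned aerial vehicles (UAVs) is one of the keys to exploiting the advantages of multi-UAV cooperation. In a battlefield environment, however, a multi-UAV system faces severe challenges from information uncertainty, computational complexity, and time pressure. To address these difficulties, the dissertation takes a multi-platform, multi-task, multi-target cooperative ground-strike mission as its background and applies closed-loop control system theory to abstract the mission planning problem. Focusing on two stages, resource allocation and formation trajectory optimization, it builds mathematical models, studies optimization theory, designs solution algorithms, and carries out simulation and experimental validation.

(1) A dynamic resource allocation model for multi-UAV cooperative missions under partial observability is established. The basic tasks in multi-UAV ground strike are defined so that the execution effects of different tasks, together with attributes such as time and resources, can be described. Within the framework of Partially Observable Markov Decision Processes (POMDP), the set of targets to be struck is treated as the controlled system, strike tasks as the system inputs, and assessment tasks as the state feedback; combined with the finiteness of resources, this yields a dynamic system optimization model under partial observability. The model reveals the intrinsic relationship between assessment and strike tasks, allows firepower and information resources in the multi-UAV system to be allocated in a unified way, reflects the uncertainty and time delay in task execution effects and target state information during actual operations, and overcomes the inability of traditional static allocation models to adapt to dynamic changes in target states.

(2) A trajectory optimization model for loose formation flight of multiple UAVs is established. The formation flight process is divided into two stages, formation forming and formation keeping. The kinematic equations of a single UAV are exactly linearized using Lie derivatives, and a graph model describes the communication topology among the UAV platforms in the formation. Building on the platform kinematic model and the communication topology model, and drawing on optimal control theory for dynamic systems, optimization models are constructed for both stages. For formation forming, a trajectory optimization model with free terminal constraints is established, in which the overall energy consumption can be reduced by choosing the formation rendezvous point. For formation keeping, individual and cooperative performance indices are defined for the platforms so that several control objectives are balanced: reducing energy consumption, tracking the reference route, and maintaining the formation structure.

(3) Starting from the common features of the resource allocation and formation trajectory optimization models, the applicability of decomposition-based optimization for weakly coupled dynamic systems is extended, and the corresponding theorems and proofs are given. The relationship between multi-UAV cooperative behavior and coupling in the mathematical models is analyzed, and decomposition methods based on convex optimization theory are introduced to decouple the problems. To reflect the network characteristics of distributed computation, an equivalent delay-free augmented network is constructed for a network with bounded delays, and the augmented optimization problem is defined accordingly. Starting from the convergence of the information transition matrices of the network nodes, the consensus and convergence of the asynchronous parallel subgradient algorithm are proved for networks with local connectivity, time-varying topology, and time delays. For weakly coupled stochastic dynamic programming problems with partially observable states, a dual function is defined from the constrained POMDP value iteration formula, the dual function is proved to be separable when the initial belief states of the subsystems are independent, dual decomposition is used to decouple the problem, and a method for constructing subgradients of the dual function is given.

(4) A dynamic resource allocation algorithm that combines offline optimization with online decision-making is proposed. A dynamic resource assignment model for a single target under ideal conditions is built on an infinite-horizon POMDP; properties of the optimal stationary policy are proved, the model is accordingly transformed into a nonlinear integer programming problem, and a solution algorithm is designed. A dynamic resource assignment model for a single target under general conditions is built on a finite-horizon POMDP; an improved linear support algorithm is proposed, and the controllability of its accuracy is proved. The dynamic resource allocation process for multiple targets is divided into an offline optimization stage and an online decision stage. The offline stage uses dual decomposition to decouple the problem into multiple single-target POMDP subproblems and coordinates the expected resource usage of the subproblems' optimal policies through resource costs. The online stage combines the resource assignment policies obtained offline with actual information and determines the task for each target using a greedy policy. Simulation results show that the method effectively overcomes the adverse effects of inaccurate target state information and meets the requirements of real-time decision-making.

(5) A negotiation-based distributed algorithm for multi-UAV formation trajectory optimization is proposed. Because the formation-forming model is coupled through its variables and the formation-keeping model is coupled through its performance indices, primal decomposition and indirect decomposition are used, respectively, to decouple them. Properties of the objective function of the coordination-level master problem after decomposition are proved, a method for constructing its subgradient is given, and a negotiation-based distributed trajectory optimization algorithm is designed. In the formation-forming stage, each UAV platform independently solves for its optimal trajectory under a fixed terminal state, obtains the direction in which moving the rendezvous point would benefit it, updates its desired rendezvous point, and sends it to neighboring platforms. After repeated negotiation, the rendezvous points proposed by the platforms agree and converge to the optimal solution. In the formation-keeping stage, each UAV platform independently solves for its own optimal trajectory under the control of dual variables, then updates the dual variables and sends them to neighboring platforms. After repeated negotiation, each platform optimizes its local index while the formation geometry is maintained. Simulation and flight-experiment results show that the method has good computational efficiency and optimization capability, significantly improves performance under the given index function, and can effectively support multi-UAV formation trajectory optimization design.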
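Item (1) of the abstract treats strike tasks as system inputs and assessment tasks as state feedback within a POMDP. As a rough illustration of what that feedback does, the sketch below applies the standard discrete Bayes belief update to one target. The two-state target model, the action and observation labels, and every probability are hypothetical values chosen for illustration; none of them come from the dissertation.

```python
def belief_update(belief, T, O, action, observation):
    """Standard discrete POMDP belief update (Bayes filter).

    belief[s]        : current probability of state s
    T[action][s][s2] : P(next state s2 | state s, action)
    O[action][s2][z] : P(observation z | next state s2, action)
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [sum(belief[s] * T[action][s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction step: weight by the likelihood of what was observed.
    unnormalized = [predicted[s2] * O[action][s2][observation]
                    for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical single-target model (assumption, not from the source).
# States: 0 = intact, 1 = destroyed. Observations: 0 = "looks intact",
# 1 = "looks destroyed".
STRIKE, ASSESS = 0, 1
T = {
    STRIKE: [[0.4, 0.6], [0.0, 1.0]],  # a strike destroys with probability 0.6
    ASSESS: [[1.0, 0.0], [0.0, 1.0]],  # assessment does not change the target
}
O = {
    STRIKE: [[0.5, 0.5], [0.5, 0.5]],  # a strike alone returns no information
    ASSESS: [[0.9, 0.1], [0.2, 0.8]],  # assessment is informative but noisy
}

b0 = [1.0, 0.0]                               # target known to be intact
b1 = belief_update(b0, T, O, STRIKE, 0)       # after striking
b2 = belief_update(b1, T, O, ASSESS, 1)       # after a "destroyed" report
```

In this toy model the strike moves the belief in "destroyed" to 0.6, and a subsequent "looks destroyed" assessment raises it to 12/13. That is the sense in which the two task types are linked: the strike changes the state, and only the assessment reduces the uncertainty about whether it worked.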

【Abstract】 Mission planning for the cooperative operation of multiple unmanned aerial vehicles (UAVs) is one of the keys to exploiting the advantages of multi-UAV cooperation. On the battlefield, however, multi-UAV systems must cope with uncertain information, computational complexity, and tight time constraints. Taking as its background a cooperative ground attack mission that involves multiple UAVs, multiple tasks, and multiple targets, this dissertation applies closed-loop control system theory to abstract the mission planning problem. It focuses on resource allocation and formation trajectory optimization: it builds mathematical models, studies optimization theory, designs algorithms, and carries out simulations and experiments.

(1) A dynamic resource allocation model for multi-UAV cooperative missions with partially observable target states is presented. The basic tasks in a multi-UAV ground attack mission are defined so that the execution effects, time, and resource properties of different tasks can be described. Within the framework of Partially Observable Markov Decision Processes (POMDP), the set of targets to be struck is taken as the controlled system, the strike tasks as system inputs, and the assessment tasks as state feedback. Combined with the finiteness of resources, this yields a dynamic system optimization model for a partially observable environment. The model reveals the intrinsic relationship between strike and assessment tasks, allocates fire and information resources in the multi-UAV system in a unified way, and captures the uncertainty and time delay in task execution effects and target state information during operations. It also overcomes the inability of traditional static models to adapt to dynamic changes in target states.

(2) A trajectory optimization model for loosely structured multi-UAV formation flight is proposed. The formation flight process is divided into two stages: the formation configuration stage and the formation maintaining stage. The UAV kinematic equations are exactly linearized using Lie derivatives, and the communication topology is described by a graph model. On this basis, optimal control theory is used to build optimization models for the two stages. For the formation configuration stage, a trajectory optimization model with free terminal constraints is built, so that overall energy consumption can be reduced by properly choosing a consensus point. For the formation maintaining stage, local and global index functions are defined so that reducing energy consumption, following the reference path, and maintaining the formation structure can be pursued at the same time.

(3) Starting from the common features of the resource allocation model and the formation trajectory optimization model, the applicability of decomposition-based optimization for weakly coupled dynamic systems is extended, and the corresponding theorems and proofs are presented. The relation between the cooperative behaviors of multiple UAVs and the coupling in the mathematical models is analyzed, and decomposition methods based on convex optimization theory are introduced to decouple the problems. To reflect the characteristics of distributed computing networks, an equivalent delay-free augmented network model is built for networks with bounded time delay, and the augmented optimization problem is defined. Then, starting from the convergence of the information transition matrices of the network nodes, the consistency and convergence of the distributed asynchronous subgradient algorithm are proved for networks with local connections, time-varying topology, and time delay. For weakly coupled stochastic dynamic programming problems with partially observable system states, the dual function is defined on the basis of the constrained POMDP value iteration equation. Under the premise that the initial belief states of the subsystems are independent, the separability of the dual function is proved. Finally, the problem is decoupled by the dual decomposition method, and a method for constructing subgradients of the dual function is given.

(4) A dynamic resource allocation algorithm that combines off-line optimization with on-line decision-making is presented. A dynamic resource assignment model for a single target under ideal conditions is built on an infinite-horizon POMDP. Based on properties of the optimal stationary policy, the model is converted into a nonlinear integer programming problem, and a solution algorithm is designed. A dynamic resource assignment model for a single target under general conditions is built on a finite-horizon POMDP. An improved linear support algorithm is proposed, and the controllability of its solution precision is proved. The resource allocation process for multiple targets is divided into two stages: an off-line optimization stage and an on-line decision stage. In the off-line optimization stage, the dual decomposition method decomposes the problem into multiple single-target POMDP sub-problems, and resource costs are used to coordinate the expected resource consumption of the sub-problems' optimal policies. In the on-line decision stage, the policies obtained off-line and the information actually gathered during task execution are taken into account, and the tasks to be executed for all targets are decided by a greedy principle. Simulation results indicate that the proposed method overcomes the adverse effects of uncertain target states and meets the demands of real-time decision-making.

(5) A negotiation-based distributed algorithm for multi-UAV formation trajectory optimization is presented. To handle the variable coupling in the trajectory optimization model of the formation configuration stage and the index coupling in the model of the formation maintaining stage, the primal decomposition method and the indirect decomposition method are used, respectively, to decouple the models. The properties of the coordination-level master problems obtained after decomposition are proved. On this basis, methods for constructing subgradients of the master problem's index function are proposed, and the negotiation-based distributed trajectory optimization algorithm is designed. In the formation configuration stage, each UAV independently solves for its own optimal trajectory with a given terminal state, obtains the direction in which changing the consensus point would benefit it, and then updates the point and sends it to its neighbors. After repeated negotiation, the consensus points proposed by all UAVs agree and converge to the optimal solution. In the formation maintaining stage, each UAV independently solves for its own optimal trajectory under the control of dual variables, then updates the dual variables and sends them to its neighbors. After repeated negotiation, each UAV maintains the geometric structure of the formation while optimizing its local index. Simulation and experimental results indicate that the proposed method has good computational efficiency and optimization capability, clearly improves performance under the given index function, and effectively supports multi-UAV formation trajectory optimization.
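The off-line stage in item (4) coordinates independent sub-problems through resource costs. The following is a minimal sketch of that price-coordination idea using dual decomposition with a projected subgradient update. It is not the dissertation's algorithm: each single-target POMDP sub-problem is replaced here by a small lookup table of expected rewards, and the tables, the budget, the step-size rule, and all function names are assumptions made for illustration.

```python
def solve_subproblem(values, price):
    """Pick the resource amount that maximizes reward minus resource cost.

    values[x] is a hypothetical expected reward for spending x units on
    one target; it stands in for solving that target's sub-problem.
    """
    return max(range(len(values)), key=lambda x: values[x] - price * x)


def dual_decomposition(value_tables, budget, step=0.5, iterations=200):
    """Coordinate independent sub-problems through a single resource price."""
    price = 0.0
    for k in range(1, iterations + 1):
        # Each target is solved on its own, given the current price.
        usage = [solve_subproblem(v, price) for v in value_tables]
        # (total usage - budget) is the negative of a subgradient of the
        # dual function, so adding it is a descent step on the dual.
        violation = sum(usage) - budget
        # Diminishing step size, projected onto price >= 0.
        price = max(0.0, price + (step / k) * violation)
    # Report the allocation that the final price induces.
    usage = [solve_subproblem(v, price) for v in value_tables]
    return price, usage


# Hypothetical reward tables with diminishing returns (assumption).
tables = [
    [0.0, 5.0, 8.0, 9.0],   # target A
    [0.0, 4.0, 6.0, 7.0],   # target B
    [0.0, 2.0, 2.5, 2.5],   # target C
]
price, usage = dual_decomposition(tables, budget=3)
```

With these made-up tables the price settles at 2.5 and the induced allocation is `[2, 1, 0]`, which uses exactly the budget of 3. With discrete choices the allocation induced by the price is not guaranteed to respect the budget in general, which is one reason a separate on-line decision step, such as the greedy rule the abstract mentions, is useful.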
