
带泊松跳跃的正倒向随机最优控制理论及其应用

Forward and Backward Stochastic Optimal Control Theory with Poisson Jumps and Its Applications

【Author】 Shi Jingtao (史敬涛)

【Advisor】 Wu Zhen (吴臻)

【Author Information】 Shandong University, Operations Research and Cybernetics, 2009, PhD


【Abstract】 The stochastic optimal control problem is an important one in modern control theory. Such a problem asks the controller to minimize/maximize a cost functional, subject to a state equation (a stochastic control system), over the set of admissible controls. An admissible control is called optimal if it achieves the infimum/supremum of the cost functional; the corresponding state variable and cost functional are then called the optimal trajectory and the value function, respectively. It is well known that Pontryagin's maximum principle and Bellman's dynamic programming are the two principal and most commonly used approaches for solving stochastic optimal control problems. The maximum principle gives a necessary condition for optimality, called the maximum condition, which is always expressed through some Hamiltonian function. The Hamiltonian function is defined in terms of the system state variable and some adjoint variables. The equation satisfied by the adjoint variables is called the adjoint equation, which consists of one or two backward stochastic differential equations (BSDEs for short) of Pardoux-Peng type. The system consisting of the adjoint equation, the original state equation, and the maximum condition is referred to as a generalized Hamiltonian system. The basic idea of the dynamic programming principle, on the other hand, is to consider a family of stochastic optimal control problems with different initial times and states, and to establish relationships among them via the so-called Hamilton-Jacobi-Bellman (HJB) equation, a nonlinear second-order partial differential equation (PDE for short). If the HJB equation is solvable, a stochastic optimal control can be obtained by taking the maximizer/minimizer of the generalized Hamiltonian function involved in the HJB equation; this result is called the stochastic verification theorem (SVT for short). The two approaches have been developed separately and independently, and the literature has recently seen some research on the relationship between them.

The main objective of this thesis is to improve and develop stochastic optimal control theory, especially for forward-backward problems with Poisson jumps. Stochastic differential equations with Poisson jumps (SDEPs for short), backward stochastic differential equations with Poisson jumps (BSDEPs for short), and forward-backward stochastic differential equations with Poisson jumps (FBSDEPs for short) are usually involved in this kind of problem. The solutions of such equations are discontinuous, since the random disturbances in them come from both Brownian motions and Poisson random measures. A Poisson random measure can be described as the counting measure associated with a jump process. More precisely, it counts the number of jumps of some discontinuous process that occur within a given time interval and whose amplitudes belong to a given measurable set. That is to say, the Poisson random measure contains all the information about a discontinuous (jump) process: it tells us when the jumps occur and how big they are. Forward and backward stochastic optimal control theory with Poisson jumps has wide practical applications in engineering and in financial markets.

In Chapter 2, we investigate the relationship between the maximum principle (MP for short) and the dynamic programming principle (DPP for short) for the stochastic optimal control problem of jump diffusions. Here the system state process is described by a controlled SDEP.
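For orientation, a controlled jump-diffusion system of the kind just described is usually written in the following general form (a sketch of the standard setting; the thesis's system (2.1) and cost functional (2.2) are not reproduced in this record and may differ in details):

$$dx(t) = b(t, x(t), u(t))\,dt + \sigma(t, x(t), u(t))\,dW(t) + \int_E g(t, x(t-), u(t), e)\,\tilde N(de, dt), \qquad x(s) = y,$$

where $W$ is a Brownian motion and $\tilde N(de, dt) = N(de, dt) - \pi(de)\,dt$ is the compensated Poisson random measure with intensity measure $\pi$; a typical cost functional is

$$J(s, y; u(\cdot)) = \mathbb{E}\left[\int_s^T f(t, x(t), u(t))\,dt + h(x(T))\right].$$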
Firstly, under some mild assumptions we give some basic properties of the value function and prove that the DPP still holds in our jump-diffusion setting. Then we give the corresponding generalized HJB equation, which is now a second-order partial integro-differential equation (PIDE for short) containing the generalized Hamiltonian function. Secondly, on the assumption that the value function is smooth, we establish the relationship between the stochastic MP and the DPP. Thirdly, using the theory of viscosity solutions, we obtain the relationship between the stochastic MP and the DPP without assuming smoothness of the value function. Finally, an SVT is first derived on the assumption that the value function is smooth, from which we can obtain the optimal control by maximizing the generalized Hamiltonian function; another version of the SVT is then proved, within the framework of viscosity solutions, without involving any derivatives of the value function.

The nonlinear BSDE was first introduced by Pardoux and Peng [74]. Independently, Duffie and Epstein [35] also introduced BSDEs from an economic background. In [35], they presented a stochastic differential formulation of recursive utility. Recursive utility is an extension of the standard additive utility in which the instantaneous utility depends not only on the instantaneous consumption rate but also on the future utility. As found by El Karoui, Peng and Quenez [37], the recursive utility process can be regarded as the solution to a special BSDE. From the BSDE point of view, [37] also gave other formulations of recursive utility and their properties. Thus a stochastic optimal control problem whose cost functional is described by the solution to a BSDE becomes a stochastic recursive optimal control problem. In Chapter 3, we consider one kind of stochastic recursive optimal control problem with Poisson jumps, where the cost functional of the control system is described by the solution to a BSDEP. For this problem, using the notion of stochastic backward semigroups introduced in Peng [79], Li and Peng [59] recently obtained the corresponding DPP and proved that the value function is a viscosity solution to some generalized HJB equation. We then investigate the relationship between the MP and the DPP for this problem. For this purpose, we first prove a local MP for forward-backward stochastic control systems with Poisson jumps. Moreover, we prove that under some additional convexity/concavity conditions, the above MP is also sufficient. An application of this result to a mean-variance portfolio selection problem mixed with a recursive utility functional optimization in the financial market is discussed. Then, on the assumption that the value function is smooth enough, we obtain the relationship between the stochastic MP and the DPP. As an application, we give an example of a linear-quadratic (LQ for short) recursive portfolio optimization problem in the financial market. In this example, the optimal control in state feedback form is obtained by both the stochastic MP and the DPP, and the relations we obtained are verified.
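In a recursive problem of this kind, the cost functional is given through the solution of a controlled BSDEP; a standard general form (a sketch, with generator $f$ and terminal reward $h$; the thesis's equations (3.1)-(3.2) may differ in details) is

$$-dy(t) = f\big(t, x(t), y(t), z(t), k(t,\cdot), u(t)\big)\,dt - z(t)\,dW(t) - \int_E k(t, e)\,\tilde N(de, dt), \qquad y(T) = h(x(T)),$$

with $J(u(\cdot)) = y(0)$, so that the present "utility" $y(t)$ depends on future utility through the generator $f$.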
LQ stochastic optimal control problems are the most important examples of stochastic optimal control problems, especially because of their nice structure and wide applications in engineering design. In Chapter 4, we study one kind of coupled forward-backward LQ stochastic optimal control problem with Poisson jumps. This kind of optimal control problem is encountered in the financial market when we consider a "large investor". We prove that there exists a unique optimal control and give its explicit linear state feedback form. When all the coefficient matrices are deterministic, we give the linear state feedback regulator for the optimal control using the solution to one kind of generalized matrix-valued Riccati equation system, and the solvability of this kind of Riccati equation system is discussed.

SDEs whose coefficients are modulated by a continuous-time Markov chain stem from regime-switching models in the financial market, which arise from the need for more realistic models that better reflect the random market environment. In a regime-switching model, the market parameters depend on a market mode that switches among a finite number of states. The market mode may reflect the state of the underlying marketplace, the general mood of investors, and other economic factors. In her doctoral thesis, Tang [97] recently introduced BSDEs with Markov chains, whose generators are disturbed by a random environment described by a continuous-time Markov chain. Motivated by an LQ stochastic optimal control problem for Markov-modulated control systems with Poisson jumps, in Chapter 5 we generalize part of the results of Tang [97] to the discontinuous case. That is to say, we consider BSDEPs with Markov chains. On the assumption that the generator satisfies a global Lipschitz condition, we obtain the existence and uniqueness of their solutions by virtue of some extended martingale representation theorems. Some properties of the solution processes and a comparison theorem in the one-dimensional case are also obtained.

Another objective of this thesis is to study partially observed fully coupled forward-backward stochastic optimal control problems. One of the most important characteristics of partially observed optimal control problems is their more practical background: in reality, controllers cannot fully observe the system states, and in most cases they can only observe some noisy process related to the states. Recently, much research attention has been attracted by optimal control problems for fully coupled forward-backward stochastic systems. One reason is that the theory is interesting and challenging in itself; another is that such systems are usually encountered when we study financial optimization problems for "large investors". In this case, the state processes are described by fully coupled FBSDEs. In Chapter 6, on the assumption that the control domain is possibly non-convex, we obtain a stochastic MP for one kind of partially observed fully coupled forward-backward stochastic optimal control problem by spike variation, duality, and filtering techniques. To illustrate the theoretical result, we give an example of a partially observed fully coupled LQ forward-backward stochastic optimal control problem. Combining classical linear filtering theory with the technique of solving linear FBSDEs, we find an explicit observable optimal control. Meanwhile, we obtain the filtering estimates of the optimal trajectories, which are given by the solutions to a forward-backward ordinary differential equation with double dimensions (DFBODE for short) and several Riccati equations. Finally, the problem with state constraints is discussed by combining Ekeland's variational principle with the techniques presented above.
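The phrase "fully coupled" can be made precise with a sketch: in a standard general form (the thesis's partially observed system (6.4) carries additional structure), a fully coupled FBSDE reads

$$\begin{cases} dx(t) = b\big(t, x(t), y(t), z(t), u(t)\big)\,dt + \sigma\big(t, x(t), y(t), z(t), u(t)\big)\,dW(t), & x(0) = x_0,\\ -dy(t) = f\big(t, x(t), y(t), z(t), u(t)\big)\,dt - z(t)\,dW(t), & y(T) = h(x(T)), \end{cases}$$

where the forward coefficients depend on the backward unknowns $(y, z)$ and the backward generator depends on the forward state $x$.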
This thesis consists of six chapters; the main results are listed below.

Chapter 1: We introduce the problems studied in Chapters 2 to 6.

Chapter 2: We establish the relationship between the maximum principle and the dynamic programming principle for the stochastic optimal control problem of jump diffusions. We consider a stochastic control system (2.1) with cost functional (2.2); the stochastic optimal control problem of jump diffusions is the following.

Problem (JD)s,y. For given (s, y) ∈ [0, T) × Rn, minimize (2.2) subject to (2.1) over U[s,T].

The main results are Theorem 2.4 for a smooth value function and Theorem 2.8 for a nonsmooth value function.

Theorem 2.4. (Relationship, Smooth Case) Suppose (H2.1)~(H2.3) hold and let (s, y) ∈ [0, T) × Rn be fixed. Let (?) be an optimal pair for Problem (JD)s,y and (?) the solution to the first-order adjoint equation (2.19). Suppose the value function V ∈ (?). Then (?), where G is defined by (2.16). Further, if V ∈ (?) and Vtx is also continuous, then (?).

Theorem 2.8. (Relationship, Nonsmooth Case) Suppose (H2.1)~(H2.3) hold and let (s, y) ∈ [0, T) × Rn be fixed. Let V ∈ (?), satisfying (2.8) and (2.9), be a viscosity solution to the generalized HJB equation (2.15), and let (?) be the optimal pair for Problem (JD)s,y. Let (?) and (?) be the solutions to the first- and second-order adjoint equations (2.19) and (2.20), respectively. Then we have (?), where the G-function is defined by (2.54).

Moreover, the following two results give the stochastic verification theorems for smooth and nonsmooth value functions, respectively.

Theorem 2.9. (SVT, Smooth Case) Suppose (H2.1)~(H2.3) hold. Let V ∈ (?) be a solution to the generalized HJB equation (2.15). Then (?). Moreover, suppose a given admissible pair (?) satisfies (?), where G is defined by (2.16). Then (?) is an optimal pair.

Theorem 2.10. (SVT, Nonsmooth Case) Suppose (H2.1), (H2.2) hold. Let V ∈ (?), satisfying (2.8) and (2.9), be a viscosity solution to the generalized HJB equation (2.15). Then we have the following. (i) (2.73) holds. (ii) Let (s, y) ∈ [0, T) × Rn be fixed and let (?) be an admissible pair. Suppose there exists (?) such that (?) and (?), where (?) satisfies (?). Then (?) is an optimal pair.
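The theorems above are organized around the generalized Hamiltonian G. For orientation, in the jump-diffusion setting the (first-order) Hamiltonian behind such maximum principles is usually taken, up to sign conventions that vary across the literature (a sketch; the thesis's G-functions (2.16) and (2.54) are second-order variants and are not reproduced here), to be

$$H\big(t, x, u, p, q, k(\cdot)\big) = \langle p, b(t, x, u)\rangle + \operatorname{tr}\big(q^{\top}\sigma(t, x, u)\big) + \int_E \langle k(e), g(t, x, u, e)\rangle\,\pi(de) - f(t, x, u),$$

and the maximum condition asserts that an optimal pair maximizes $H$ over the control set for almost every $t$, almost surely.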
Chapter 3: We establish the relationship between the maximum principle and the dynamic programming principle for the stochastic recursive optimal control problem with Poisson jumps. As preliminaries, we first consider a forward-backward stochastic control system (3.1) with cost functional (3.2); the forward-backward stochastic optimal control problem is the following.

Problem (FB)0,T. For given x0 ∈ Rn, minimize (3.2) subject to (3.1) over Uad.

Using the convex variational method, we first prove a local maximum principle.

Theorem 3.1. (Local Stochastic Maximum Principle) Suppose (H2.1), (H2.3)', (H3.1) and (H3.2) hold. Let u(·) be an optimal control for Problem (FB)0,T, and let (?) be the corresponding optimal trajectory. Then we have (?), where the Hamiltonian function H is defined by (3.7).

Moreover, under some additional convexity/concavity conditions, the necessary condition in Theorem 3.1 is also sufficient.

Theorem 3.2. (Sufficient Condition for Optimality) Suppose (H2.1), (H2.3)', (H3.1)~(H3.3) hold. Let u(·) be an admissible control and (?) the corresponding trajectory with y(T) = (?). Let (?) be the solution to the adjoint equation (3.6). Suppose that H is convex with respect to (?). Then u(·) is an optimal control for Problem (FB)0,T if it satisfies (3.8).

We then investigate the relationship between the maximum principle and the dynamic programming principle for the stochastic recursive optimal control problem with Poisson jumps. We consider a stochastic control system (3.35) with cost functional (3.31); the stochastic recursive optimal control problem is the following.

Problem (R)s,y. For given (s, y) ∈ [0, T) × Rn, minimize (3.31) subject to (3.35) over U[s,T].

The main result is the following.

Theorem 3.6. (Relationship, Recursive Problem, Smooth Case) Suppose (H2.1), (H2.3)', (H3.1), (H3.2) hold and let (?) be fixed. Let u(·) be an optimal control for Problem (R)s,y, and let (?) be the corresponding optimal trajectories. Let (?) be the solutions to the adjoint equation (3.36). Suppose the value function (?). Then (?). Further, if (?) and Vtx is also continuous, then (?).

Chapter 4: We study one kind of coupled forward-backward LQ stochastic optimal control problem with Poisson jumps. We consider a stochastic control system (4.5) with cost functional (4.6); the LQ stochastic optimal control problem is the following.

Problem (LQ)0,T. For given x0 ∈ Rn, minimize (4.6) subject to (4.5) over Uad.

We prove that there exists a unique optimal control and give its explicit linear state feedback form.

Theorem 4.1. There exists a unique optimal control for Problem (LQ)0,T: (?), where (?) is the corresponding optimal trajectory.

When all the coefficient matrices are deterministic, we can give the linear state feedback regulator for the optimal control using the solution to one kind of generalized matrix-valued Riccati equation system.

Theorem 4.2. Suppose that for all t ∈ [0, T] there exist matrices (K(t), M(t), Y(t,·)) satisfying the generalized matrix-valued Riccati equation system (4.9). Then the optimal linear state feedback regulator for Problem (LQ)0,T is (?), and the optimal value function is (?).

The solvability of this kind of generalized matrix-valued Riccati equation system is discussed. In some special cases, we obtain the following existence and uniqueness result.

Theorem 4.5. Suppose (H4.3) holds and D ≡ 0. Then the generalized matrix-valued Riccati equation system (4.9) has a unique solution (?).

Chapter 5: We study BSDEPs with Markov chains. As motivation, we first discuss an LQ stochastic optimal control problem with Poisson jumps and Markov chains. We consider a stochastic control system (5.1) with cost functional (5.2), where (?) is a continuous-time Markov chain with state space (?); the transition probabilities of α are given by (?), where qij ≥ 0 for i ≠ j and qii = (?). The LQ stochastic optimal control problem with Markov chains is the following.

Problem (LQMC)0,T. For given x0 ∈ Rn, minimize (5.2) subject to (5.1) over Uad.

An optimal state feedback control and the optimal value function are obtained via the solution to a constrained stochastic Riccati equation.

Theorem 5.1. If the constrained stochastic Riccati equation (5.4) admits a solution triple (?), then Problem (LQMC)0,T is well-posed. Moreover, the optimal state feedback control is (with the time argument t suppressed) (?), and the value function is (?).

Motivated by this kind of stochastic Riccati equation, we study the BSDEP with Markov chains (5.8). On the assumption that the generator satisfies a global Lipschitz condition, we obtain the existence and uniqueness of its solution by virtue of some extended martingale representation theorems.
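Display (5.8) itself is elided above; as a hedged sketch, a BSDEP whose generator is modulated by a continuous-time Markov chain $\alpha$ is typically written as

$$Y(t) = \xi + \int_t^T f\big(s, Y(s), Z(s), K(s, \cdot), \alpha(s)\big)\,ds - \int_t^T Z(s)\,dW(s) - \int_t^T\!\!\int_E K(s, e)\,\tilde N(de, ds), \qquad 0 \le t \le T,$$

whose solution is a triple $(Y, Z, K(\cdot))$ of the kind appearing in Theorems 5.2 and 5.4; the thesis's formulation may differ in details.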
Theorem 5.2. (Existence and Uniqueness of Solution to BSDEP with Markov Chains) Suppose (H5.1) holds. Then BSDEP (5.8) admits a unique solution triple (?).

Some properties of the solution processes are obtained, and a comparison theorem in the one-dimensional case is proved. To this end, let (?) be another continuous-time Markov chain with state space (?), whose transition probabilities are given by (?), where qjk ≥ 0 for j ≠ k and (?).

Theorem 5.4. (Comparison Theorem) Suppose n = 1. Let (?) satisfy (H5.2). Furthermore, let the process (?) be measurable and satisfy 0 ≤ l(t, e) ≤ C(1 ∧ |e|), e ∈ E. We set (?) for all (ω, t, y, z, φ, i) ∈ (?). Let (?) and f' be defined as (?) for all (?), where (?) satisfies (H5.2). We denote by (Y, Z, K(·)) (respectively, (Y', Z', K'(·))) the unique solution triple to BSDEP (5.8) with data (ξ, f) (respectively, (ξ', f')). Then, if
(iv) ξ ≥ ξ', a.s.;
(v) for the Markov chains α, β, (?) holds a.s.;
(vi) f(t, y, z, k(·), i) is nondecreasing with respect to i ∈ M and (?), a.s., a.e., (?),
it follows that (?). If, in addition, we also assume that P(ξ > ξ') > 0, then P(Y(t) > Y'(t)) > 0 for all t ∈ [0, T]. In particular, Y(0) > Y'(0).

Chapter 6: We study one kind of partially observed fully coupled forward-backward stochastic optimal control problem. We consider a stochastic control system (6.4) with observation equation (6.5) and cost functional (6.7); the partially observed stochastic optimal control problem is the following.

Problem (PO)0,T. For given x0 ∈ Rn, minimize (6.7) subject to (6.4) and (6.5) over Uad.

Our main result is the following.

Theorem 6.1. (Partially Observed Stochastic Maximum Principle) Suppose (H6.1)~(H6.3) hold. Let u(·) be a partially observed optimal control for Problem (PO)0,T, let (?) be the optimal trajectory, and let Z(·) be the corresponding solution to (6.6). Let (P(·), Q(·)) be the solution to the auxiliary BSDE (6.34) and (p(·), q(·), k(·)) the solution to the adjoint FBSDE (6.35). Then we have (?), where the Hamiltonian function H is defined by (6.36).

To illustrate the theoretical result, we give an example of a partially observed fully coupled LQ forward-backward stochastic optimal control problem. We consider a stochastic control system (6.38) with observation (6.39) and cost functional (6.40); the partially observed LQ stochastic optimal control problem is the following.

Problem (POLQ)0,T. For given x0 ∈ Rn, minimize (6.40) subject to (6.38) and (6.39) over Uad.

Combining classical linear filtering theory with the technique of solving linear FBSDEs, we find an explicit observable optimal control. In addition, we obtain the filtering estimates of the optimal trajectories, which are given by the solutions to a forward-backward ordinary differential equation with double dimensions (DFBODE) and several Riccati equations.

Theorem 6.2. (LQ Case, Observable Optimal Control and Filtering Estimates of Optimal Trajectories) For Problem (POLQ)0,T, an observable optimal control u(·) is given by (6.47), where (?) are the solutions to DFBODE (6.53), Π(·) is the solution to Riccati equation (6.44), and (?) are given by (6.51). Moreover, the filtering estimates of the optimal trajectories (?) are given by the solutions to DFBODE (6.53) and (6.57), respectively, where Σ(·) is the solution to Riccati equation (6.55).
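For orientation, the classical linear filtering theory invoked here is the Kalman-Bucy filter (a standard sketch, not the thesis's equations (6.44) or (6.55)): for a signal $dx(t) = A x(t)\,dt + F\,dw(t)$ observed through $dY(t) = C x(t)\,dt + dv(t)$, with $w, v$ independent Brownian motions, the estimate $\hat x(t) = \mathbb{E}[x(t) \mid \mathcal{F}^Y_t]$ satisfies

$$d\hat x(t) = A\hat x(t)\,dt + \Sigma(t) C^{\top}\big(dY(t) - C\hat x(t)\,dt\big), \qquad \dot\Sigma(t) = A\Sigma(t) + \Sigma(t)A^{\top} + FF^{\top} - \Sigma(t)C^{\top}C\,\Sigma(t),$$

where $\Sigma$ is the error covariance; the thesis couples filtering equations of this type with forward-backward dynamics and Riccati equations.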
Finally, we consider the problem with state constraints (6.57). The partially observed stochastic optimal control problem with state constraints is the following.

Problem (POC)0,T. For given x0 ∈ Rn, minimize (6.7) subject to (6.4) and (6.5) under the constraints (6.57) over Uad.

The main result is the following.

Theorem 6.3. (Partially Observed Stochastic Maximum Principle with State Constraints) Suppose (H6.1)~(H6.4) hold. Let u(·) be a partially observed optimal control for Problem (POC)0,T, (?) the optimal trajectory, and Z(·) the corresponding solution to (6.6). Then there exists a nonzero triple (?) with (?), and (?), which are the solutions to the auxiliary BSDE (6.60) and the adjoint FBSDE (6.61), respectively, such that the following maximum condition holds: (?), where the Hamiltonian function H is defined by (6.59).
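As a purely illustrative companion to the dynamics studied throughout, the following minimal Python sketch simulates one path of a scalar controlled jump-diffusion driven by a Brownian motion and a compensated Poisson random measure. All coefficients, the feedback rule u_feedback, and the parameter values are hypothetical choices for illustration, not taken from the thesis.

```python
import numpy as np

# Minimal Euler scheme for a scalar controlled jump-diffusion (illustrative only):
#   dx = b(t,x,u) dt + sigma(t,x,u) dW + int_E g(t,x-,u,e) N~(de,dt),
# where N~ is a compensated Poisson random measure: jumps arrive at rate lam
# and carry i.i.d. standard normal marks e.

rng = np.random.default_rng(0)

T, n = 1.0, 1000           # horizon and number of time steps
dt = T / n
lam = 5.0                  # jump intensity (hypothetical)

def b(t, x, u):            # drift (hypothetical linear coefficients)
    return -0.5 * x + u

def sigma(t, x, u):        # diffusion coefficient (hypothetical)
    return 0.2 * x + 0.1

def g(t, x, u, e):         # jump amplitude as a function of the mark e
    return 0.1 * x * e

def u_feedback(t, x):      # hypothetical linear state feedback
    return -0.3 * x

x = np.empty(n + 1)
x[0] = 1.0
for i in range(n):
    t = i * dt
    u = u_feedback(t, x[i])
    dW = rng.normal(0.0, np.sqrt(dt))
    # jumps on (t, t + dt]: Poisson number of jumps with i.i.d. N(0,1) marks
    jump_sum = sum(g(t, x[i], u, rng.normal())
                   for _ in range(rng.poisson(lam * dt)))
    # compensator int_E g(t,x,u,e) pi(de) dt; since g is linear in e and
    # E[e] = 0, it vanishes here, but the term mirrors the compensated measure
    compensator = lam * g(t, x[i], u, 0.0) * dt
    x[i + 1] = (x[i] + b(t, x[i], u) * dt + sigma(t, x[i], u) * dW
                + jump_sum - compensator)

print(f"x(T) = {x[-1]:.4f}, max |x(t)| = {np.abs(x).max():.4f}")
```

The compensated form of the jump term matches the $\tilde N(de, dt)$ convention used in the displays above.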

  • 【Online Publication Contributor】 Shandong University
  • 【Online Publication Year/Issue】 2010, Issue 12