
线性约束矩阵最小二乘问题:理论与算法

Linearly Constrained Matrix Least Squares Problems: Theory and Algorithms

【Author】 裘渔洋 (Qiu Yuyang)

【Supervisor】 张振跃 (Zhang Zhenyue)

【Author Information】 Zhejiang University, Computational Mathematics, 2007, PhD

【Abstract (translated from the Chinese)】 In recent years, with the development of society and the progress of science, and driven by pressing needs arising in applications from other disciplines and engineering, linear matrix equations with various constraints and the corresponding least squares problems have attracted growing attention. Examples include the Sylvester equation, the Lyapunov equation, and constrained inverse eigenvalue problems together with their least squares counterparts, all of which are widely used in many fields.

Theoretically, by means of the Kronecker product, a constrained least squares problem in matrix form can be rewritten equivalently in vector form, so the general solution can be expressed by vectors in a high-dimensional space (the identity involved is stated explicitly after this abstract). Since a constrained equation can be viewed as the special case of the corresponding least squares problem with zero residual, the same representation applies to constrained equations. However, this matrix-to-vector transformation enlarges the coefficient matrix unnecessarily, which directly multiplies the computational cost and the storage requirement; at the same time, special structure of the coefficient matrices may be lost in the representation of the solution. To overcome the difficulties that the Kronecker product introduces into structural analysis and the representation of general solutions, many researchers are more interested in matrix-level representations of the general solutions of unconstrained or linearly constrained least squares problems. The usual tools are matrix factorizations such as the generalized singular value decomposition (GSVD), the canonical correlation decomposition (CCD), and the quotient singular value decomposition (QSVD). In general, a matrix equation can be reduced by such a factorization to a "diagonal-like" form whose resulting system is easy to solve: one only needs to partition the new matrix variable suitably, after which some sub-blocks are determined directly and the rest are obtained by solving a small linear system. In addition, some researchers solve particular constrained least squares problems numerically by iteration. Generally speaking, a factorization-based method is a special-purpose technique for a specific problem: it demands considerable skill and depends on both the equation itself and the constraints. It is therefore hard to apply a particular factorization directly to a family of related but different problems; such methods are poorly "portable". Moreover, they are usually not "robust": a small change in the constraints may cause a large change in the solution process.

A natural question is whether a unified scheme exists, for both general solutions and iterative methods, for general linearly constrained least squares problems. In this thesis we attempt a framework-style treatment in both theory and algorithms, studying these classical problems from a new entry point and perspective. We believe the methods proposed here can offer insight and help for solving constrained least squares problems. The main contributions of this thesis are as follows:

1. Using a basis of matrices for the constraint space (the constraint matrices), we characterize the linearly constrained space by a lower-dimensional column-vector space (the coordinate space), and we give a matrix representation of the orthogonal projection of an arbitrary matrix onto the constraint space.

2. We reveal the intrinsic relation between the solution spaces of strongly and weakly constrained least squares problems. Whether the strongly constrained solution set is contained in the weakly constrained one depends on the right-hand side matrix T of the equation. For the general case we prove that there exists an equivalent map φ(T) of T such that the strongly constrained least squares problem with right-hand side φ(T) has the same solutions as the original strongly constrained problem. More importantly, the solution set of the weakly constrained problem with right-hand side φ(T) contains all solutions of the original strongly constrained problem. Hence, for the given right-hand side φ(T), if the general solution of the corresponding weakly constrained problem is known or easy to obtain, the original strongly constrained problem is also easy to solve. Theorem 3.5 establishes this latent property and also shows that the strongly constrained solution set is exactly the intersection of the strongly constrained space with the weakly constrained solution set for right-hand side φ(T). Obtaining the equivalent map is thus the key to solving constrained least squares problems. The equivalent map φ(T) is not unique; it can be taken as the orthogonal projection onto the range of the strongly constrained transformation (the range, over the strongly constrained space, of the linear transformation defined by the problem). We present several theorems characterizing the equivalent map, give verifiable ways to construct it, and then apply the theory to several special constrained least squares problems, obtaining simple representations of their general solutions. Recovering strongly constrained general solutions from weakly constrained solution sets provides a unified scheme for general constrained least squares problems, and it is particularly suitable when more constraints are imposed step by step.

3. For two classes of special equations with particular constraints, we remove the constraints via the eigenvalue decomposition of the matrices appearing in the constraints, show the original constrained equation to be equivalent to an unconstrained equation whose general solution is known, and then reconstruct the solution of the original constrained equation. The representation of the general solution does not involve the eigenvalue decomposition itself.

4. We propose several matrix-form Krylov subspace methods for solving linearly constrained least squares problems. In essence these algorithms are the matrix iterations corresponding to the vector-form CGLS, GMRES, and LSQR, but they are not mere rewritings of them. Theoretically, using basis matrices and the Kronecker product we obtain an unconstrained vector form of the constrained least squares problem and apply vector-form Krylov subspace methods to it, from which the matrix-form Krylov methods are derived; in the actual matrix iterations, however, neither Kronecker products nor basis matrices are allowed to appear, so the vector-to-matrix correspondence is nontrivial. Theorem 5.2 characterizes a one-to-one correspondence between a given vector and a specific matrix in the constraint space, which realizes the equivalent transformation of Krylov subspace methods from vector form to matrix form; concrete algorithms are given for equations with special constraints. Several examples illustrate the numerical behavior of these algorithms. We believe the ideas proposed here also apply to the derivation of other matrix-form iterative methods.
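For concreteness, the vectorization referred to above is the classical vec/Kronecker identity; a minimal statement in our own notation (not the thesis's) is:

```latex
\operatorname{vec}(AXB) = (B^{\mathsf T} \otimes A)\,\operatorname{vec}(X),
\qquad\text{hence}\qquad
\min_{X}\,\|AXB - T\|_{F}
= \min_{x}\,\bigl\|(B^{\mathsf T} \otimes A)\,x - \operatorname{vec}(T)\bigr\|_{2},
\quad x = \operatorname{vec}(X).
```

For $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$, the coefficient matrix $B^{\mathsf T} \otimes A$ has size $mq \times np$; this size blow-up is exactly the growth in computation and storage that both abstracts describe.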

【Abstract】 In recent decades, with the development of society and the progress of science, more and more new problems demanding solutions have arisen in engineering technology and many other domains. Many of them reduce to linear systems of matrix equations and/or least squares problems with linear constraints on the solutions; such systems include the Sylvester equation, the Lyapunov equation, inverse eigenvalue problems, and the corresponding least squares problems, which have been widely studied in many applied fields, and the area has attracted more and more researchers.

Theoretically, a least squares problem in matrix form can be rewritten in vector form via the Kronecker product, so solutions of the matrix least squares problem are known in terms of high-dimensional column vectors. The same holds for matrix equations, since a linear system of matrix equations can be regarded as a least squares problem with zero residual. However, this equivalent transformation from matrix LS to vector LS may enlarge the coefficient matrices of the linear systems unnecessarily, which results in great computational cost and storage requirements. Moreover, any special structure of the coefficient matrices in the matrix system is lost in the vector expressions of the general solutions, even if matrix solutions are reconstructed from those expressions. To avoid these difficulties, essentially to avoid introducing Kronecker products into the structural analysis or formulation of solutions, matrix-level analysis of solutions of matrix LS problems is of great interest, and there have been valuable efforts on this issue. The techniques commonly used are matrix factorizations such as the generalized singular value decomposition (GSVD), the canonical correlation decomposition (CCD), and the quotient singular value decomposition (QSVD). In general, these factorizations reduce matrix equations to a simple diagonal-like form so that the new systems can be solved easily by suitably partitioning the new matrix variables: some sub-matrices are determined immediately and the others are obtained by solving a small linear system. Meanwhile, iterative methods for numerically solving such least squares problems have also been considered, and some problems with special constraints have been solved. In general, the factorization procedure applied to matrix LS problems requires proficient skill and depends on the system of matrix equations and on the constraints on the solutions; it is difficult to apply one factorization procedure consistently to related but different problems. Furthermore, these approaches are generally not robust: minor modifications of the constraints may give rise to great changes in the solutions.

A natural question is whether we can find a uniform approach for analyzing or iteratively solving linear systems of matrix equations with linear constraints on the solutions. This thesis concentrates on this question and tries to give answers, based on a new viewpoint and jumping-off point for these well-known problems. We present a framework in both theory and algorithms, and we believe the methods proposed in this thesis are enlightening and helpful for other constrained matrix LS problems. The contributions of this thesis are outlined as follows:

1. We characterize the linearly constrained matrix spaces by lower-dimensional parameter vector spaces (coordinate spaces), using matrix-form bases. These properties are used to analyze and solve least squares problems with linear constraints; for an arbitrary matrix we also present its orthogonal projection onto the constrained space (a sketch of such a projection is given after this abstract).

2. We uncover the inherent relations between the solution space of the matrix LS problem with (relatively) stronger constraints and that of the matrix LS problem with weaker constraints. Whether the former is contained in the latter depends only on the right-hand side matrix T of the equation. We prove that there exists an equivalent transformation φ(T) of the right-hand side matrix T such that the solution space of the constrained problem is unchanged when T is replaced by φ(T). More importantly, the solution space of the weaker-constrained matrix LS problem with the transformed right-hand side φ(T) contains all solutions of the original problem with the stronger constraints. Therefore, if the solution space associated with the weak constraints and the right-hand side φ(T) is known or easily obtained, the solution set of the strongly constrained problem can be constructed easily: it is the intersection of the strong-constraint space and the solution set of the weak-constraint problem. We establish this latent property in Theorem 3.5. The transformation is not unique; it can be taken as the orthogonal projection of the right-hand side matrix onto the range space of the linear system of matrix equations corresponding to the strong constraints. We present the details of constructing the equivalent map. This development matters for solving linearly constrained matrix LS problems because the matrix LS problem with weaker constraints (or without constraints) may already have been solved. We apply the technique to some constrained matrix LS problems and obtain simple matrix representations of their solutions; indeed, it can handle matrix LS problems as more linear constraints are added.

3. We discuss two kinds of structured matrix equations with specific constraints. Theoretically, a matrix eigenvalue decomposition is used to remove the constraints in our discussion; the eigenvalue decomposition is later eliminated, however, and general solutions of the constrained equations are reconstructed. The resulting formula for the general solutions is simple and eigenvector-free.

4. We propose matrix-form Krylov subspace methods for solving linearly constrained matrix LS problems. Basically, they are the algorithms CGLS, GMRES, bidiagonalization Lanczos, and LSQR in matrix-iteration form, yet they are not trivial restatements of the vector Krylov methods. The matrix Krylov methods are derived by theoretically removing the constraints (using a basis of the constraint space), employing the Kronecker product to form a linear system in long-vector form, and applying vector-form Krylov methods; however, Kronecker products are not involved in the resulting matrix-form Krylov methods. Theorem 5.2 accomplishes this transformation of the iterations from vector form to matrix form (an illustrative matrix-form CGLS sketch follows this abstract). Numerical experiments are reported to show the efficiency of the matrix Krylov methods. We believe the basic ideas can be generalized to derive other matrix-form iterative algorithms as well.
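As a sketch of the kind of object contribution 1 describes (our notation; the thesis's explicit matrix representation may differ): if the constraint space $\mathcal S \subseteq \mathbb{R}^{m \times n}$ has an orthonormal basis of matrices $B_1, \dots, B_k$ under the trace inner product $\langle Y, Z\rangle = \operatorname{tr}(Y^{\mathsf T} Z)$, then the orthogonal projection onto $\mathcal S$ and the coordinate identification read:

```latex
P_{\mathcal S}(Z) = \sum_{i=1}^{k} \langle B_i, Z \rangle\, B_i,
\qquad
X \in \mathcal S \;\longleftrightarrow\;
c(X) = \bigl(\langle B_1, X\rangle, \dots, \langle B_k, X\rangle\bigr)^{\mathsf T} \in \mathbb{R}^{k}.
```

The map $c$ identifies the constraint space with a lower-dimensional column-vector space, which is what makes a "coordinate space" treatment of the constrained LS problem possible.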
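To illustrate the flavor of contribution 4, here is a minimal matrix-form CGLS sketch for the unconstrained model problem min_X ||AXB - T||_F. It is not the thesis's algorithm (the constraint handling via Theorem 5.2 is omitted, and the function name matrix_cgls is ours); it only shows how CGLS on the Kronecker system (B^T ⊗ A)x = vec(T) collapses to the matrix operations X -> AXB and R -> A^T R B^T, so the Kronecker matrix is never formed:

```python
import numpy as np

def matrix_cgls(A, B, T, tol=1e-10, maxit=500):
    """CGLS for min_X ||A X B - T||_F, keeping every iterate in matrix form.

    The Kronecker operator x -> (B^T kron A) x becomes X -> A X B, and its
    adjoint becomes R -> A^T R B^T, so kron(B.T, A) is never formed.
    """
    n, p = A.shape[1], B.shape[0]
    X = np.zeros((n, p))
    R = T - A @ X @ B            # residual of the matrix LS problem
    S = A.T @ R @ B.T            # adjoint operator applied to the residual
    P = S.copy()
    gamma = np.vdot(S, S)        # squared Frobenius norm ||S||_F^2
    for _ in range(maxit):
        Q = A @ P @ B
        alpha = gamma / np.vdot(Q, Q)
        X += alpha * P
        R -= alpha * Q
        S = A.T @ R @ B.T
        gamma_new = np.vdot(S, S)
        if np.sqrt(gamma_new) < tol:   # normal-equation residual small enough
            break
        P = S + (gamma_new / gamma) * P
        gamma = gamma_new
    return X

# Hypothetical usage on random data:
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))
B = rng.standard_normal((4, 6))
T = rng.standard_normal((8, 6))
X = matrix_cgls(A, B, T)
print(np.linalg.norm(A @ X @ B - T))   # Frobenius residual of the LS fit
```

Each iteration costs two products with A and two with B, whereas merely storing kron(B.T, A) explicitly would take O(mnpq) memory: this is the saving the abstract attributes to the matrix-form iterations.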

  • 【Online Publication Submitter】 Zhejiang University
  • 【Online Publication Year/Issue】 2007, Issue 06
  • 【CLC Number】 O241
  • 【Citations】 2
  • 【Downloads】 722