
Approximation Capabilities of Sum-of-Product Neural Networks and Radial Basis Function Neural Networks

【Author】 隆金玲 (Long Jinling)

【Supervisor】 吴微 (Wu Wei)

【Author Information】 Dalian University of Technology, Computational Mathematics, 2008, PhD

【摘要 (Abstract)】 Neural network theory has developed rapidly in recent years, and the approximation capability of a neural network is an important measure of its performance. The mappings to be approximated in practical applications are usually very complicated, and we cannot expect to compute these unknown mappings exactly. A popular approach is therefore to approximate static mappings by neural networks that compute superpositions and linear combinations of univariate or other simple functions. This leads to the following question: whether, or under what conditions, is a family of neural network output functions dense in a given space of multivariate functions? This is the study of the approximation capability of neural networks.

As a fundamental problem of neural networks, the approximation capability problem has attracted wide attention from engineers and mathematicians as neural networks have developed. Density is the ability to approximate functions in theory; density alone does not mean that the corresponding form is an efficient approximation scheme, but without a guarantee of density a network cannot be used for approximation at all. Mathematically, the approximation problem of neural networks can be divided into four aspects: approximation of functions, approximation of families of functions (strong approximation), approximation of continuous functionals, and approximation of continuous operators. Many neural network models have been proposed so far, and feedforward networks are the most widely used, so studying the approximation capability of the various feedforward networks is all the more pressing.

The approximation capability of radial basis function (RBF) neural networks has been studied in depth, but the known results still need to be developed and improved. Moreover, existing studies of the approximation of families of functions all invoke the known function approximation theorems for multilayer perceptron (MLP) and RBF networks to obtain strong approximation results for these two particular networks. Does a similar connection between function approximation and strong approximation also hold for general feedforward networks? This question is of real importance for building a unified approximation theory framework.

Sum-of-Product neural networks and Sigma-Pi-Sigma neural networks, proposed in 2000 and 2003 respectively, are multilayer networks built from product neurons and summing neurons; they aim to overcome the large memory requirements and learning difficulties encountered by classical RBF and MLP networks, and they perform well in function approximation, prediction, classification, and learning control tasks. This thesis discusses the uniform and L^p approximation capabilities of both networks.

Existing approximation theory for neural networks mainly establishes approximation capability by existence arguments. We use a constructive method to prove that, for three-layer feedforward networks with RBF-type or translation and dilation invariant (TDI) hidden units, one need only choose the hidden-unit weight parameters at random and then suitably adjust the weights between the newly added hidden units and the output unit, and the network output function can approximate any function in L^2(R^d) to arbitrary accuracy. Our result also yields a natural way to build incremental networks that approximate functions in L^2(R^d).

Ridge functions of the form g(a·x), and their linear combinations, are widely used in topology, neural networks, statistics, harmonic analysis, and approximation theory; here g is a univariate function and a·x denotes the inner product in the Euclidean space R^n. Determining to what extent the representation of a function as a sum of ridge functions is unique is an important problem. Existing results cover the cases g ∈ C(R) and g ∈ L^1_loc(R); we extend the corresponding conclusions to g ∈ L^p_loc(R) (1 ≤ p < ∞) and g ∈ D′(R). In addition, when a function can be written as a sum of ridge functions, the relationship between the smoothness of the function itself and the smoothness of each summand is also of interest in this thesis.

The structure and content of the thesis are as follows.

Chapter 1 reviews background knowledge on neural networks and introduces the significance, the methods, and the current state of research on the approximation capability theory of neural networks.

Chapter 2 studies the uniqueness of the representation of a function as a sum of ridge functions. We prove that if f(x) = ∑_{i=1}^{m} g_i(a_i·x) = 0, where the directions a_i = (a_{1i}, …, a_{ni}) ∈ R^n \ {0} are pairwise linearly independent and g_i ∈ L^p_loc(R) (or g_i ∈ D′(R) with g_i(a_i·x) ∈ D′(R^n)), then each g_i is a polynomial of degree at most m-2. A smoothness theorem for linear combinations of ridge functions is also given.

Chapter 3 gives results on the function approximation capability of RBF neural networks in L^p spaces, together with their strong approximation and operator approximation capabilities. These results improve recent results of Chen Tianping, Jiang Chuanhai, and others on RBF network approximation and provide a theoretical basis for applying RBF networks. We also obtain a strong approximation theorem for feedforward networks in a general form, of which many existing results are special cases.

Chapter 4 shows that when a continuous function on R serves as the activation function of a Sum-of-Product neural network, the set of functions generated by the network is dense in C(K) if and only if the activation function is not a polynomial. A necessary and sufficient condition for the set of functions generated by Sigma-Pi-Sigma neural networks to be dense in C(K) is also given.

Chapter 5 establishes a necessary and sufficient condition for the set of functions generated by Sum-of-Product neural networks to be dense in L^p(K). Based on this result, the L^p approximation capability of Sigma-Pi-Sigma neural networks is discussed.

Chapter 6 studies the capability of three-layer incremental feedforward networks with random hidden units to approximate functions in L^2(R^d), focusing on networks with RBF-type and TDI-type hidden units. We show that, for a network with RBF hidden units and any nonzero activation function g: R → R with g(‖x‖) ∈ L^2(R^d), or for a network with TDI hidden units and any nonzero activation function g(x) ∈ L^2(R^d), if the weights between the hidden units and the output unit are chosen properly, then the output function of a three-layer incremental network with n random hidden units converges with probability 1 to any target function in L^2(R^d) as n → ∞.
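To make the incremental construction with random hidden units (Chapter 6) concrete, here is a minimal numerical sketch, not the thesis's actual algorithm: Gaussian RBF hidden units with randomly drawn centers and widths are added one at a time on a bounded interval, and only the hidden-to-output weights are refitted, here by ordinary least squares on a sample grid. The toy target function, the Gaussian activation, and the least-squares refit are all assumptions made for illustration.

    import numpy as np

    # Illustrative only: random Gaussian RBF hidden units on [-3, 3]; only the
    # hidden-to-output weights are fitted (here by least squares on a sample grid).
    rng = np.random.default_rng(0)

    def target(x):
        return np.sin(3 * x) * np.exp(-x ** 2)      # assumed toy target function

    x = np.linspace(-3.0, 3.0, 400)
    y = target(x)

    design = np.empty((x.size, 0))                  # one column per hidden unit
    for n in range(1, 51):                          # add random units one at a time
        center = rng.uniform(-3.0, 3.0)             # random center (never trained)
        width = rng.uniform(0.2, 1.0)               # random width (never trained)
        unit = np.exp(-((x - center) / width) ** 2) # Gaussian RBF unit output
        design = np.column_stack([design, unit])
        w, *_ = np.linalg.lstsq(design, y, rcond=None)  # adjust output weights only
        if n % 10 == 0:
            rms = np.sqrt(np.mean((design @ w - y) ** 2))
            print(f"{n:2d} random hidden units, RMS error {rms:.4f}")

The point mirrored from the abstract is that the hidden-unit parameters are chosen at random and never trained; only the weights between the hidden units and the output unit are adjusted as units are added.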

【Abstract】 In recent years, neural network theory has developed rapidly. Approximation theory is important for analyzing the computational capability of neural networks. The mappings arising in approximation applications are usually very complicated, and we cannot expect to compute these unknown mappings exactly. Thus, a current trend is to use artificial neural networks to approximate multivariate functions by computing superpositions and linear combinations of simple univariate functions. This is related to the density problem of neural networks: whether, or under what conditions, is a family of neural network output functions dense in a space of multivariate functions, i.e., the approximation capability of neural networks. Approximation capability, a basic problem of neural networks, has aroused extensive attention among engineers and mathematicians along with the development of neural networks. Density is the capability to approximate functions in theory, and denseness does not always give an effective scheme; however, a class of networks cannot be used for approximation without a guarantee of denseness. From a mathematical point of view, the approximation problem of neural networks can be studied from four aspects: function approximation, approximation of families of functions (strong approximation), functional approximation, and operator approximation. Many neural network models have been proposed so far. Feedforward neural networks are the most widely used in applications, so it is important to study the approximation capabilities of the various feedforward neural networks.

There have been deep investigations of the approximation capability of radial basis function (RBF) neural networks, but the known results still need to be improved. Meanwhile, the existing investigations of approximation of families of functions rely on the known function approximation theorems for RBF and multilayer perceptron (MLP) networks. Thus we ask: is there a similar relationship between approximation of functions and approximation of families of functions for general feedforward neural networks? It is desirable to propose an integrated theoretical framework for this problem. Sum-of-Product neural networks (SOPNN) and Sigma-Pi-Sigma neural networks (SPSNN) were proposed in 2000 and 2003, respectively; product and additive neurons are their basic units. The new structures overcome the extensive memory requirement as well as the learning difficulty of MLP and RBF neural networks, and they perform well in function approximation, prediction, classification, and learning control. We discuss both the uniform and L^p approximation capabilities of these networks.

In comparison with the conventional existence approach in approximation theory for neural networks, we follow a constructive approach to prove that one may simply choose the hidden-unit parameters of three-layer Translation and Dilation Invariant (TDI) neural networks and RBF neural networks at random, and then adjust the weights between the hidden units and the output unit, to make the networks approximate any function in L^2(R^d) to arbitrary accuracy. Furthermore, the result also presents an automatic and efficient way to construct incremental three-layer feedforward networks for function approximation in L^2(R^d).

Ridge functions of the form g(a·x) and their linear combinations are widely used in topology, neural networks, statistics, harmonic analysis, and approximation theory, where g is a univariate function and a·x denotes the inner product of a and x in R^n. When we study a function represented as a sum of ridge functions, it is fundamental to understand to what extent the representation is unique. The known results consider two cases: g ∈ C(R) and g ∈ L^1_loc(R). We draw the same conclusion under the conditions g ∈ L^p_loc(R) (1 ≤ p < ∞), or g ∈ D′(R) and g(a·x) ∈ D′(R^n). Provided that a function is represented by a sum of ridge functions, the relationship between the smoothness of the given function and that of the sum components is also analyzed.

This thesis is organized as follows.

Chapter 1 reviews background information on feedforward neural networks and introduces the significance of the approximation capability theory of neural networks, the methods usually used, and the progress of research in this area.

Chapter 2 investigates the uniqueness of the representation of a given function as a sum of ridge functions. It is shown that if f(x) = ∑_{i=1}^{m} g_i(a_i·x) = 0, where a_i = (a_{1i}, …, a_{ni}) ∈ R^n \ {0} are pairwise linearly independent and g_i ∈ L^p_loc(R) (or g_i ∈ D′(R) with g_i(a_i·x) ∈ D′(R^n)), then each g_i is a polynomial of degree at most m-2. In addition, a theorem on the smoothness of linear combinations of ridge functions is obtained.

Chapter 3 mainly deals with the capability of RBF neural networks to approximate functions, families of functions, functionals, and operators. Besides, we follow a general approach to obtain an approximation capability theorem for feedforward neural networks on a compact set of functions; it covers the existing results in this respect as special cases.

Chapter 4 proves that the set of functions generated by SOPNN with an activation function in C(R) is dense in C(K) if and only if the activation function is not a polynomial. The necessary and sufficient condition under which the set of functions generated by SPSNN is dense in C(K) is also derived. Here K is a compact set in R^N.

Chapter 5 gives a necessary and sufficient condition under which the set of functions generated by SOPNN is dense in L^p(K). Based on the L^p approximation result for SOPNN, the L^p approximation capability of SPSNN is also studied.

Chapter 6 studies the approximation capability of three-layer incremental constructive feedforward neural networks with random hidden units for functions in L^2(R^d). RBF neural networks and TDI neural networks are mainly discussed. Our result shows that, given any nonzero activation function g: R+ → R with g(‖x‖) ∈ L^2(R^d) for RBF hidden units, or any nonzero activation function g(x) ∈ L^2(R^d) for TDI hidden units, the output function of an incremental network with n randomly generated hidden units converges to any target function in L^2(R^d) with probability 1 as n → ∞, provided the weights between the hidden units and the output unit are properly adjusted.
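As a worked illustration of the Chapter 2 uniqueness result (an example added here for concreteness, not taken from the thesis), consider the simplest case m = 2 with the coordinate directions a_1 = (1, 0) and a_2 = (0, 1), which are pairwise linearly independent:

    \[
        g_1(a_1 \cdot x) + g_2(a_2 \cdot x) \;=\; g_1(x_1) + g_2(x_2) \;\equiv\; 0
        \quad\Longrightarrow\quad g_1 \equiv c, \qquad g_2 \equiv -c .
    \]

Fixing x_2 shows that g_1 equals the constant c = -g_2(x_2), and fixing x_1 shows that g_2 equals -c. Each g_i is therefore a polynomial of degree at most m-2 = 0, which is exactly the bound stated in Chapter 2.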
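The abstracts describe Sum-of-Product networks only as multilayer networks built from product neurons and summing neurons. The following Python sketch shows one plausible output function of that kind, namely a sum of products of univariate activations; the parameterization, the tanh activation, and the name sum_of_product_output are illustrative assumptions, not the thesis's definition.

    import numpy as np

    def sum_of_product_output(x, w, b, c, sigma=np.tanh):
        """Hypothetical sum-of-product form: sum_k c_k * prod_j sigma(w_kj*x_j + b_kj).

        x: input of shape (d,); w, b: parameters of shape (K, d);
        c: output weights of shape (K,); sigma: assumed univariate activation."""
        units = sigma(w * x + b)        # univariate activations, shape (K, d)
        products = units.prod(axis=1)   # product neurons, one per row
        return c @ products             # summing output neuron

    rng = np.random.default_rng(1)
    d, K = 3, 5
    w = rng.normal(size=(K, d))
    b = rng.normal(size=(K, d))
    c = rng.normal(size=K)
    print(sum_of_product_output(np.array([0.2, -1.0, 0.5]), w, b, c))

Under the condition quoted from Chapter 4, a non-polynomial activation such as tanh is the kind of activation for which density of the generated function set in C(K) can be expected.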
