Study on Statistical Learning Theory and Its Applications in Geoscience (统计学习理论及其在地学中的应用研究)

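【Author】 Wang Zhongwen (王忠文)

【Supervisor】 Fan Jizhang (范继璋)

【Author information】 Jilin University, Earth Exploration and Information Technology, 2007, Master's thesis

【Subtitle】 Research on the Application of Support Vector Machines in Geoscience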

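【Abstract (translated from Chinese)】 The aim of this thesis is to introduce the ideas and methods of statistical learning theory into research on nonlinear methods for geoscience information processing. It studies statistical learning theory and the mathematical model, algorithms, and program implementation of the support vector machine (SVM) method, in order to provide technical support for the nonlinear processing of geoscience information. Geoscience data are multi-scale, multi-temporal, multi-precision, and multi-map-scale, and they admit multiple interpretations, so the correspondence between observed data and the essential nature of the object under study is nonlinear. Geoscience information processing therefore requires nonlinear methods. An SVM can transform a problem in a nonlinear space into a linear space and solve it there, which makes it well suited to the nonlinear processing of geoscience information. The thesis first discusses the theoretical foundation of SVM, statistical learning theory. It then discusses replacing the empirical risk minimization criterion with the structural risk minimization principle, which addresses the shortcoming of using asymptotic theory to estimate expected risk from finite samples; SVM is a concrete realization of structural risk minimization. Finally, the thesis applies the least squares support vector machine to simulation experiments on several specific geoscience problems and compares the results with those of multivariate statistical methods and the BP neural network method.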

【Abstract】 Statistical learning theory (SLT) is a small-sample statistical theory proposed by Vapnik and others. It focuses on statistical laws and learning methods in the small-sample setting. SLT establishes a sound theoretical framework for machine learning problems and has given rise to a new general-purpose learning algorithm, the support vector machine (SVM), which offers a better solution to small-sample learning problems. SLT and SVM have become a research hot spot in the international machine learning community. Vapnik and his research team at AT&T Bell Laboratories proposed SVM, a new classification technique with great potential, building on work that dates back to 1963. SVM is a pattern recognition method based on SLT and is applied mainly in the pattern recognition field. In the 1990s, SLT matured while comparatively new machine learning methods such as neural networks ran into serious difficulties, for instance how to determine the network architecture, overfitting, underfitting, and local minima; this drove the rapid development and refinement of SVM. SVM shows many distinctive advantages in small-sample, nonlinear, and high-dimensional pattern recognition problems. It has since developed rapidly and has been applied successfully in many domains, such as bioinformatics and text and handwriting recognition. The kernel function is the most attractive feature of SVM. Sets of vectors in a low-dimensional space are often hard to separate, and the remedy is to map them into a high-dimensional space. This mapping increases computational complexity, but the kernel function neatly solves that problem. In other words, by choosing a suitable kernel function, one obtains the classification function in the high-dimensional space.
In SVM theory, different kernel functions give rise to different SVM algorithms. In the 1960s and 1970s, geological research gradually introduced mathematical methods and techniques. By type of geological application, these statistical models include: moving-average, simple and multiple regression forecasting, and grey system forecasting; classification and pattern recognition; distance-based cluster analysis; Bayesian classifiers; maximum likelihood classification; correlation analysis; factor analysis (principal component analysis); optimization; evaluation and planning; linear programming; fuzzy comprehensive evaluation; and the analytic hierarchy process. These mathematical methods helped transform geology from a descriptive science into a quantitative one. However, they have shown many shortcomings when dealing with nonlinear problems. In fact, high-dimensional, complex nonlinear problems dominate geological research. In the mid-1990s, neural network models were applied to geological analysis. After more than ten years of research, the performance of artificial neural networks has improved greatly, and they are now used very widely in geological applications, covering nearly every subfield. Their capabilities, such as nonlinear pattern recognition, classification, forecasting, optimization, and control, are widely applied.
Of course, the method used in information processing must be chosen according to the variables and the nature of the data. When there are few variables and the relationship between phenomenon and essence is fairly clear, conclusions can be drawn directly by logical inference; if the phenomenon under study occurs under some probabilistic condition, probability and statistical analysis can be used. When there are many variables and the relationship between phenomenon and essence is complex, it is generally difficult to draw conclusions directly by logical inference. In that case, if the variables are all quantitative, multivariate statistical analysis can be applied; when qualitative variables are included, quantification theory can be used. Because the Earth's origin, evolution, and development are not repeatable, and for historical reasons in the development of science and technology, geological data are multi-scale, multi-temporal, multi-precision, and multi-map-scale, and they admit multiple interpretations. As a result, the correspondence between observed data and the essential nature of the object under study is nonlinear, so geological data and information processing require nonlinear methods. With a limited number of samples, ideal results are hard to obtain. In practice the available training samples are limited, and approaches that are theoretically mature can perform unsatisfactorily in practical applications.
For example, the BP algorithm can become trapped in local minima during optimization and suffers from overfitting. Encouragingly, several strengths of SVM meet the needs of geological work, so this thesis introduces SVM into geological research. In a preliminary simulation experiment on a gold deposit, least squares SVM (LS-SVM) was used to classify geochemical anomalies, with an accuracy of 86%. SVM research has produced some results so far, but most of it remains at the simulation-experiment stage and applied research is still insufficient. This thesis aims to apply SVM to geological research, to enrich the theory of quantitative analysis, and to propose ideas and methods for solving complex nonlinear geological problems.
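
The kernel mapping and the LS-SVM classifier described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the commonly used LS-SVM formulation in which training reduces to solving one linear system, an RBF kernel, and a toy XOR data set that is not linearly separable in the original two-dimensional space. The function names and the values of `gamma` and `sigma` are hypothetical choices made for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-sq_dist / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Train an LS-SVM classifier (assumed standard formulation).

    Solves the linear system
        [ 0   y^T           ] [b    ]   [0]
        [ y   Omega + I/gamma] [alpha] = [1]
    with Omega_ij = y_i * y_j * K(x_i, x_j).
    """
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]  # bias b, multipliers alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Classify new points with sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Toy XOR data: no straight line separates the two classes in 2-D.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

b, alpha = lssvm_fit(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
print(pred)
```

Because the kernel is evaluated only between pairs of input points, the high-dimensional feature map is never computed explicitly, which is the saving the abstract attributes to the kernel function.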

【Key words】 statistical learning theory; support vector machine (SVM); geoscience; LS-SVM
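  • 【Online publication contributor】 Jilin University
  • 【Online publication issue】 2007, Issue 03
  • 【Classification number】 P628.2
  • 【Citation count】 4
  • 【Download count】 499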