一种新的特征选择方法及其在路面使用性能分析中的应用

A New Feature Extraction and Its Application in Road Performance Analysis

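【Author】 于哲夫 (Yu Zhefu)

【Advisor】 贾传荧 (Jia Chuanying)

【Author information】 Dalian Maritime University (大连海事大学), Traffic Information Engineering and Control, 2011, Ph.D.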

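【Abstract (translated from the Chinese 摘要)】 Highway management information systems hold a wealth of census inspection data. These data can be used both for the comprehensive evaluation of pavement performance and for predicting it. In essence, both tasks are regression analyses of the census data, and the regression must meet the following requirements: 1. to ensure accurate evaluation and prediction, the model must be nonlinear; 2. because the available data are limited, the model must work on small-sample data sets; 3. it must be robust to noise in the data; 4. the regression model must be expressible in a simple, easily understood explicit form, so that causes can be analyzed. Such a regression model can provide a basis for scientific decision-making in highway maintenance.

Existing regression methods often perform poorly on this problem. Support vector regression, for example, when trained on a very small sample, yields a regression function with low accuracy and a distorted order; the function is also too complex to show the relationship between input and output clearly. Neural network regression overfits, yields no explicit regression function, and cannot reflect the relationship between input and output. To address these problems, this thesis proposes two new feature selection methods. Applying them to the census inspection data in highway management information systems yields a new method for the comprehensive evaluation of pavement performance and a new method for predicting it.

The innovative work of this thesis is mainly the following:

(1) A feature selection method based on a matrix similarity measure, a genetic algorithm, and support vector machines is proposed. The method uses the matrix similarity measure to choose a nonlinear space, uses the genetic algorithm to select features from that space, and finally uses a linear support vector machine to obtain a concise regression or decision function. Experiments show that, when the sample is very small, the method achieves higher regression accuracy than other methods. The resulting regression function is simple and clear, which facilitates causal analysis and makes the link between input and output directly visible. It is also shown theoretically that the matrix similarity measure is an effective way to control the VC dimension.

(2) A more widely applicable sequential minimization method based on a mixed kernel function, the matrix similarity measure, and kernel principal component analysis (KPCA) is proposed. The KPCA step uses a mixed kernel whose weights and shape parameters are optimized by a genetic algorithm with the matrix similarity measure as the fitness, which keeps the complexity of the kernel as low as possible. The sequential minimization method then further screens and selects the principal components, reducing the dimension of the input space; because it is a linear support vector regression, it does not increase the VC dimension of the learning machine. Validation shows that the method is more accurate than earlier similar methods.

(3) The feature selection method based on the matrix similarity measure, genetic algorithm, and support vector machine is applied to the comprehensive evaluation of pavement performance. It overcomes the difficulty of a very small sample and expresses the relationship between the many forms of pavement damage and pavement performance as a simple, easily understood polynomial, which facilitates analysis of the components of the comprehensive evaluation.

(4) The same feature selection method is applied to predicting the deterioration of pavement performance. It overcomes the difficulty of incomplete data in pavement maintenance information systems and expresses the relationship between the many factors that affect pavement performance and pavement performance itself as a simple, easily understood function, which facilitates causal analysis of pavement performance.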
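The abstract does not define its matrix similarity measure or the form of its mixed kernel. As a rough illustration only, the sketch below assumes a kernel-target-alignment-style measure (the normalized Frobenius inner product between a Gram matrix and the outer product of the centred targets) and a mixed kernel that is a convex combination of an RBF kernel and a polynomial kernel. An exhaustive search over a few candidate parameters stands in for the genetic algorithm. All function names, parameter grids, and data are hypothetical and are not taken from the thesis.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def poly_kernel(x, z, degree):
    """Inhomogeneous polynomial kernel (1 + <x, z>)^degree."""
    return (1.0 + sum(a * b for a, b in zip(x, z))) ** degree

def mixed_gram(X, weight, gamma, degree):
    """Gram matrix of the assumed mixed kernel: weight*RBF + (1-weight)*poly."""
    return [[weight * rbf_kernel(x, z, gamma)
             + (1.0 - weight) * poly_kernel(x, z, degree) for z in X] for x in X]

def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def alignment(K, y):
    """Assumed similarity measure: cosine between K and y y^T (Frobenius)."""
    target = [[yi * yj for yj in y] for yi in y]
    return frobenius(K, target) / math.sqrt(frobenius(K, K) * frobenius(target, target))

def pick_mixed_kernel(X, y, weights, gammas, degrees):
    """Exhaustive search standing in for the genetic algorithm."""
    candidates = [(w, g, d) for w in weights for g in gammas for d in degrees]
    return max(candidates, key=lambda p: alignment(mixed_gram(X, *p), y))

# Toy data, made up for illustration: one input, centred targets.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [-1.5, -0.5, 0.5, 1.5]
weight, gamma, degree = pick_mixed_kernel(X, y, [0.0, 0.5, 1.0], [0.1, 1.0], [1, 2])
score = alignment(mixed_gram(X, weight, gamma, degree), y)
print(weight, gamma, degree, score)
```

The selected Gram matrix could then be passed to a kernel PCA step. Whether the thesis maximizes this quantity or uses a different matrix similarity is not stated in the abstract.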

【Abstract】 Highway management information systems contain a wealth of census data. These data can be used for the comprehensive evaluation of pavement performance and for forecasting pavement performance. In essence, both tasks are regression analyses of the census data, with the following requirements: 1. the regression should be nonlinear, to ensure accurate evaluation and prediction; 2. it should be applicable to small-sample data sets; 3. it should be robust to noise in the data; 4. the regression model should be an explicit function that is simple and makes causality easy to analyze. Such evaluation and forecasting models can provide a sound basis for decision-making on road maintenance.

For practical problems of this kind, existing regression methods are ineffective. Support vector regression trained on a small sample overfits easily: the regression function has low precision and a distorted degree. Neural network methods do not yield an explicit function and cannot reflect the relationship between input and output. To solve these problems, two new feature extraction methods are proposed. Applying them in a highway management information system gives a new comprehensive evaluation and a new prediction of pavement performance.

The innovations of this thesis are as follows:

(1) A feature extraction method based on matrix similarity measurement, a genetic algorithm (GA), and linear support vector regression (SVR) is proposed. First, the nonlinear space is selected using matrix similarity measurement. Features are then extracted from the nonlinear space by the GA, and a regression function is obtained by linear SVR. Experiments show that the precision is higher than that of other methods when the sample size is small. The regression function obtained has a simple and clear form, which facilitates causality analysis and makes the relationship between input and output intuitive. In addition, it is proved that matrix similarity measurement is effective in controlling the VC dimension.

(2) A sequential minimization method based on a mixed kernel, matrix similarity measurement, and kernel principal component analysis (KPCA) is proposed. The mixed kernel is used in KPCA, and its parameters are determined by a GA with the matrix similarity measurement as the fitness, so that kernel complexity is controlled as far as possible. The sequential minimization method is then used to choose principal components, which further reduces the dimension of the input space. Because the sequential minimization method is a linear SVM, it does not increase the VC dimension of the learning machine. Experiments show that this method is more accurate than previous similar methods.

(3) The feature extraction method based on matrix similarity measurement, GA, and linear SVR is applied to pavement performance evaluation. The difficulties caused by a small training data set are avoided, and the relationship between pavement performance and the various kinds of road damage is expressed as a simple polynomial function, which makes the causality easy to analyze.

(4) The feature extraction method based on matrix similarity measurement, GA, and linear SVR is applied to pavement performance prediction. The relationship between pavement performance and the various influencing factors is expressed clearly as a simple polynomial function, which provides a sound basis for decision-making on road maintenance.
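The pipeline in innovation (1), selecting a few terms from a nonlinear feature space and then fitting a linear model on them, can be sketched roughly as follows. This is a minimal illustration under several assumptions that do not come from the thesis: the nonlinear space is taken to be all monomials of degree 1 and 2, a small ridge-regularized least-squares fit stands in for linear SVR, the GA fitness is leave-one-out error plus a penalty on the number of selected terms, and the data are synthetic. All function names are hypothetical.

```python
import random

def polynomial_terms(x):
    """Candidate features: all degree-1 and degree-2 monomials of the inputs."""
    terms = list(x)
    for i in range(len(x)):
        for j in range(i, len(x)):
            terms.append(x[i] * x[j])
    return terms

def solve_linear_system(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def fit_ridge(rows, y, ridge=1e-6):
    """Least-squares weights (bias first); a stand-in for linear SVR."""
    Z = [[1.0] + list(r) for r in rows]
    d = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(d)]
    return solve_linear_system(A, b)

def predict(w, row):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))

def loo_error(features, y, mask):
    """Leave-one-out mean squared error using only the terms kept by mask."""
    cols = [k for k, keep in enumerate(mask) if keep]
    if not cols:
        return float("inf")
    rows = [[f[k] for k in cols] for f in features]
    total = 0.0
    for i in range(len(y)):
        w = fit_ridge(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        total += (predict(w, rows[i]) - y[i]) ** 2
    return total / len(y)

def select_features(features, y, generations=30, pop_size=20, penalty=0.01, seed=0):
    """Tiny GA over bit masks: tournament selection, one-point crossover, mutation."""
    rng = random.Random(seed)
    n_terms = len(features[0])
    cache = {}

    def cost(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = loo_error(features, y, mask) + penalty * sum(mask)
        return cache[key]

    population = [[rng.randint(0, 1) for _ in range(n_terms)] for _ in range(pop_size)]
    best = min(population, key=cost)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1 = min(rng.sample(population, 3), key=cost)
            p2 = min(rng.sample(population, 3), key=cost)
            cut = rng.randrange(1, n_terms)
            child = p1[:cut] + p2[cut:]
            children.append([bit ^ 1 if rng.random() < 0.1 else bit for bit in child])
        population = children
        best = min(population, key=cost)
    return best

# Synthetic small sample (made up): y = 1 + 2 * x1 * x2.
X = [[0, 1], [1, 0], [1, 2], [2, 1], [2, 3], [3, 2], [3, 3], [0.5, 1.5]]
y = [1 + 2 * a * b for a, b in X]
features = [polynomial_terms(x) for x in X]
mask = select_features(features, y)
print(mask)
```

The returned mask marks which monomials are kept, so the fitted model can be written out as an explicit polynomial in the selected terms. That explicit form is the property the abstract emphasizes for causality analysis.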
