Research on Local Model-Based Time Series Prediction

【Author】 Wang Jun

【Advisor】 Peng Xiyuan

【Author information】 Harbin Institute of Technology, Instrument Science and Technology, 2007, PhD

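【Abstract (translated from Chinese)】 Time series analysis has long received wide attention from researchers at home and abroad and has become an active research topic of considerable theoretical and practical value. Time series prediction is a principal task of time series analysis, with broad applications in industrial automation, hydrology, geology, stock markets, military science, and other fields. At present, time series prediction relies mainly on global models, which suffer from low modeling efficiency, poor prediction performance, and high computational complexity for real-time model updating. In recent years, researchers have begun to bring theories and techniques from data mining, pattern recognition, signal processing, and chaos into time series prediction: the data are partitioned in the time or frequency domain, and prediction models are built within each local time-frequency region. Local models can improve prediction accuracy while also reducing the complexity of the prediction model and the computational complexity of building it. Many problems in local model-based prediction nevertheless remain open. This dissertation studies local modeling for time series prediction in both the decomposition domain and the time domain, focusing on end effect handling for empirical mode decomposition, local model selection and real-time updating in the decomposition domain, adaptive clustering of time series with arbitrarily shaped clusters, nonlinear feature extraction and fast attribute reduction for time series classification, and local time-domain support vector prediction modeling with incremental updating. The main innovative results are as follows.
First, to address the end effect in empirical mode decomposition, the dissertation proposes a sequence extension method based on similarity search. The method exploits the self-similarity of a linear or nonlinear time series: it locates the places in the series whose pattern resembles the pattern at an endpoint and uses the subsequences preceding or following those places to extend the series. The extended segments are therefore closer to the likely true continuation of the series before its start or after its end, which greatly reduces the end effect. Because a fast nearest neighbor search is used to find the similar subsequences, the computational complexity of the method is very low. Simulation experiments verify the effectiveness of this end effect suppression method.
Second, the dissertation builds prediction models for the individual intrinsic mode function (IMF) components in the decomposition domain using radial basis function (RBF) neural networks and an incremental kernel-space independent vector combination prediction algorithm. Decomposition, however, increases the computational burden of model parameter selection. To solve this problem, parameter selection is performed for only two components, and the parameters for the remaining components are computed from the relationship between the optimal parameter values of the local component models and the individual IMFs, which greatly reduces the computational burden of prediction modeling in the decomposition domain. In addition, to overcome the slow real-time updating of RBF neural networks, the incremental kernel-space independent vector combination prediction algorithm, which has low computational complexity, is proposed for modeling the IMF components. Simulation experiments verify that prediction in the decomposition domain with RBF neural networks and this algorithm outperforms a single prediction model.
Third, existing validity criterion functionals for estimating the number of clusters cannot reliably recover the correct number. Drawing on the idea of regularization, the dissertation proposes a penalty-based criterion functional for estimating the number of clusters. The curve of this functional against the number of clusters is unimodal or approximately unimodal, which makes the estimated number of clusters more accurate and more robust. Simulation experiments verify that the functional estimates the correct, or nearly correct, number of clusters.
Fourth, because the computational complexity of current rough set attribute reduction algorithms is still high, the dissertation proposes a fast rough set attribute reduction algorithm based on function mapping. The algorithm uses a nearest neighbor search with a gradually shrinking search space to map each sample quickly to its indiscernibility relation, which greatly reduces the computational complexity of the original algorithms. Simulation results show that the algorithm scales well with both data size and data dimensionality.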
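The similarity-search extension described in the first contribution can be illustrated with a small sketch. It is not the dissertation's implementation: the function name `extend_right`, its parameters, the squared Euclidean distance, and the level shift at the join are all assumptions made for illustration, and a plain linear scan stands in for the fast nearest neighbor search that the abstract mentions.

```python
def extend_right(series, pattern_len, ext_len):
    """Extend `series` to the right by `ext_len` samples (illustrative only).

    Finds the earlier window most similar to the last `pattern_len` samples
    and copies the samples that followed that window, shifted so that the
    extension joins the end of the series without a jump.
    """
    n = len(series)
    if pattern_len + ext_len >= n:
        raise ValueError("series too short for this pattern and extension")

    tail = series[n - pattern_len:]
    best_start, best_dist = None, float("inf")

    # Each candidate window must be followed by at least `ext_len` samples.
    # Brute-force scan (assumption): the abstract uses a fast NN search.
    for start in range(0, n - pattern_len - ext_len + 1):
        window = series[start:start + pattern_len]
        dist = sum((a - b) ** 2 for a, b in zip(window, tail))
        if dist < best_dist:
            best_start, best_dist = start, dist

    match_end = best_start + pattern_len
    # Level shift (assumption): align the matched window's last sample
    # with the last sample of the series.
    offset = series[-1] - series[match_end - 1]
    extension = [series[match_end + k] + offset for k in range(ext_len)]
    return list(series) + extension


periodic = [0, 1, 2, 1] * 5
extended = extend_right(periodic, pattern_len=4, ext_len=3)
print(extended[-3:])
```

The left end can be handled the same way by reversing the series, extending it, and reversing the result. The extended series would then supply the extra extrema used for envelope interpolation during sifting.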

【Abstract】 Time series analysis has attracted the attention of many researchers and has become an active research field of great theoretical and practical value. Time series forecasting is the main task of time series analysis and has been widely applied in fields such as industrial automation, hydrology, geology, the stock market, and military science. At present, global models are the main tool for time series prediction, but they suffer from low modeling efficiency, low prediction accuracy, and high computational complexity for model updating. In recent years, techniques from data mining, pattern recognition, signal processing, and chaos theory have been incorporated into time series prediction: the time series data are divided in the time domain or frequency domain, and prediction models are constructed in each local time or frequency region. Local models for time series prediction can produce more accurate predictions with lower model complexity and lower computational complexity of modeling. However, many problems remain open. This dissertation discusses local modeling for time series prediction in the decomposition domain and the time domain, focusing on boundary effect processing for empirical mode decomposition, model selection and updating in the decomposition domain, adaptive clustering of time series data with arbitrarily shaped clusters, nonlinear feature extraction and fast attribute reduction for time series classification, and support vector prediction modeling in the local time domain. The main contributions of this dissertation are as follows.
First, a boundary effect processing method for empirical mode decomposition based on similarity searching is proposed. The method uses the local self-similarity of a linear or nonlinear signal to extend the series and obtain extra extrema for spline interpolation. The extended series is thus closer to the real series before the first endpoint and after the last endpoint, which greatly reduces the boundary effect. Furthermore, the computational complexity is low because a fast nearest neighbor search is used. Experimental results validate the effectiveness of the method.
Second, an RBF network and the Incremental Independent Vector Combination Predicting algorithm in Kernel Space (IIVCPKS) are used to construct a forecasting model for every intrinsic mode function (IMF) component obtained by empirical mode decomposition. Decomposition, however, increases the computation needed to select the parameters of the forecasting models. To resolve this problem, model parameter selection is performed for only two IMFs, and the remaining model parameters are computed from the relationship between the model parameters and the IMFs, which alleviates the computational burden of model selection. Furthermore, to address the low updating efficiency of the RBF network, the IIVCPKS algorithm is proposed, which has lower computational complexity. Experimental results show that time series prediction in the empirical mode decomposition domain with an RBF neural network and the IIVCPKS algorithm is more accurate than prediction with a single model.
Third, a novel penalty-based clustering validity index is proposed to estimate the number of clusters. A penalty term is introduced into the validity index, which makes its curve convex-like or almost convex-like. A more accurate estimate of the number of clusters can be obtained by minimizing this validity index. Experimental results show that the proposed validity index estimates the correct, or nearly correct, number of clusters.
Fourth, an efficient rough-set-based attribute reduction algorithm with function mapping is proposed. In this algorithm, a fast nearest neighbor search with a gradually shrinking search space is used to reduce the computational complexity of determining the indiscernibility relation, the positive region, and related quantities. Experimental results show that the proposed algorithm computes attribute reductions more efficiently and scales well with data size and data dimensionality.
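For readers unfamiliar with the rough set quantities named in the last contribution, the sketch below shows what an attribute reduction algorithm has to compute repeatedly. It is an illustration under stated assumptions, not the dissertation's algorithm: a hash map keyed by attribute values stands in for the nearest-neighbor-based function mapping, and the names `partition`, `positive_region`, and `dependency` are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over `attrs`.

    Two rows are indiscernible when they agree on every attribute in `attrs`.
    A dict keyed by the attribute values maps each row to its class in one
    pass (a stand-in for the mapping step described in the abstract).
    """
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())


def positive_region(rows, decisions, attrs):
    """Indices of rows whose indiscernibility class has one decision value."""
    pos = []
    for block in partition(rows, attrs):
        if len({decisions[i] for i in block}) == 1:
            pos.extend(block)
    return sorted(pos)


def dependency(rows, decisions, attrs):
    """Fraction of rows that `attrs` classify without ambiguity."""
    return len(positive_region(rows, decisions, attrs)) / len(rows)


# Toy decision table (made up for illustration): two condition attributes.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0)]
decisions = [0, 1, 0, 1, 1]
print(positive_region(rows, decisions, [0, 1]))
print(dependency(rows, decisions, [1]))
```

A reduct search typically drops or adds attributes while checking that the dependency degree of the subset matches that of the full attribute set, so the cost of building these partitions dominates the running time.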
