Study on Time Series Pattern Classification Based on the Geometric Algebra Representation Principle (基于几何代数表示原理的时间序列模式分类问题研究)

【Author】 Zhao Yong (赵勇)

【Supervisor】 Hong Wenxue (洪文学)

【Author Information】 Yanshan University, Instrument Science and Technology, 2012, PhD

【Abstract】 Time series arise widely in scientific experiments, economics and finance, industrial control, biomedicine, and many other fields. Analyzing time series data quickly and effectively, and mining the information it contains, helps reveal how things develop and change, and provides a basis for understanding their nature and for making sound decisions. As an important branch of data mining, time series data mining therefore has both theoretical research value and practical significance.

Time series pattern classification is one of the main tasks of time series data mining, and a reasonable time series representation is the prerequisite for correct classification. Current representations mainly use numerical, serial methods, which cannot extract the correlation information among features and cannot achieve effective dimensionality reduction. To address this problem, this thesis proposes a time series pattern classification method based on the geometric algebra representation principle. The method embeds multiple features of a time series into a geometric algebra space for parallel representation and parallel processing, achieving effective fusion of multiple features. The work consists of three parts.

First, the thesis studies a generalized model for the geometric algebra embedding representation of multivector features for univariate and multivariate time series. Based on this model, the original time series is decomposed by wavelet packet multiscale analysis, and the wavelet packet coefficient features of four nodes are extracted and embedded in a quaternion for parallel representation. A quaternion principal component analysis algorithm reduces the dimension of the multiscale feature sequences in parallel, and a geometric product is then computed on the resulting quaternion principal components so that it reflects the correlation information among the features.
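As an illustration of the quaternion embedding idea, the sketch below places four feature values in the four components of a quaternion and multiplies two such quaternions with the Hamilton product, which is one common way to realize a geometric product on quaternion-valued data. The abstract does not state the component layout or the exact product used in the thesis, so the names `Quaternion`, `embed_features`, and `hamilton_product` and the layout (w, x, y, z) are assumptions made for this example only.

```python
from typing import NamedTuple


class Quaternion(NamedTuple):
    """q = w + x*i + y*j + z*k. Each component holds one feature (assumed layout)."""
    w: float
    x: float
    y: float
    z: float


def embed_features(f0: float, f1: float, f2: float, f3: float) -> Quaternion:
    """Hypothetical helper: pack four scalar features into one quaternion."""
    return Quaternion(float(f0), float(f1), float(f2), float(f3))


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Hamilton product p*q. It is non-commutative, and every output component
    mixes several input components, so the result couples the four features."""
    return Quaternion(
        p.w * q.w - p.x * q.x - p.y * q.y - p.z * q.z,
        p.w * q.x + p.x * q.w + p.y * q.z - p.z * q.y,
        p.w * q.y - p.x * q.z + p.y * q.w + p.z * q.x,
        p.w * q.z + p.x * q.y - p.y * q.x + p.z * q.w,
    )


if __name__ == "__main__":
    a = embed_features(1, 2, 3, 4)
    b = embed_features(5, 6, 7, 8)
    print(hamilton_product(a, b))
    print(hamilton_product(b, a))  # differs from a*b: the product is not commutative
```

Because each component of the product combines cross terms of the inputs, the product carries information about how the embedded features relate to one another, which a component-by-component (serial) treatment would not.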
Classification experiments were carried out on normal and epileptic EEG time series. They examine how two key parameters, the number of quaternion principal components and the wavelet type, affect the classification results, and compare the method with traditional serial feature representation and extraction methods for time series.

Second, the thesis studies the geometric object representation of multidimensional time series features and the extraction of geometric structure features. The original time series is divided into subspaces; multidimensional features are extracted in each temporal subspace and mapped to points in a high-dimensional space, and the feature points representing the subspaces together form a geometric object in the high-dimensional feature space. The thesis studies how to describe these geometric objects in the language of geometric algebra and how to extract geometric structure features using geometric algebra operations. Triangle representations of multiple ECG morphological features are built in two-dimensional and three-dimensional space, the most effective geometric structure feature parameters are determined, and pattern classification is performed on five classes of ECG time series from the MIT/BIH arrhythmia database.

Finally, the thesis studies the symbolic representation of quaternion-fused multiple features of time series and the extraction of symbolic entropy features. A geometric algebra embedding representation is given for multiple high-order cumulants of a time series, which are embedded in a quaternion for parallel representation. The 2-norm and the determinant of the quaternion component matrix are computed as the fusion of the high-order cumulants. Based on the symbolic aggregate approximation algorithm, the sequence of fused feature values is converted to a symbolic representation, from which symbolic entropy features are extracted.
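The symbolization step can be sketched with a basic symbolic aggregate approximation (SAX) followed by an entropy computation. This is a minimal sketch: it assumes the usual SAX recipe (z-normalization, piecewise aggregate approximation, equiprobable Gaussian breakpoints) and takes "symbolic entropy" to mean the Shannon entropy of the symbol histogram. The abstract does not give the thesis's exact definition or parameters, so `sax_symbols`, `symbolic_entropy`, `n_segments`, and `alphabet_size` are hypothetical names.

```python
import math
from collections import Counter
from statistics import NormalDist, fmean, pstdev


def sax_symbols(series, n_segments, alphabet_size):
    """Convert a numeric sequence to integer symbols 0..alphabet_size-1."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")

    # 1) z-normalize (a constant series maps to all zeros)
    mu, sigma = fmean(series), pstdev(series)
    z = [0.0 if sigma == 0 else (v - mu) / sigma for v in series]

    # 2) piecewise aggregate approximation: mean of each chunk
    bounds = [(i * n) // n_segments for i in range(n_segments + 1)]
    paa = [fmean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

    # 3) breakpoints that split the standard normal into equiprobable bins
    cuts = [NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]

    # symbol = number of breakpoints lying below the segment mean
    return [sum(v > c for c in cuts) for v in paa]


def symbolic_entropy(symbols):
    """Shannon entropy (in bits) of the symbol histogram."""
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    fused = [0.1, 0.3, 0.2, 0.9, 1.4, 1.2, 0.4, 0.2]  # made-up fused feature values
    syms = sax_symbols(fused, n_segments=4, alphabet_size=3)
    print(syms, symbolic_entropy(syms))
```

A sequence that stays in one symbol has entropy 0, while one that uses all symbols equally often reaches the maximum of log2(alphabet_size) bits, which is why the feature can separate states with different signal regularity.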
Classification experiments were carried out on EEG time series from three different states of rats with absence epilepsy. The thesis presents the symbol sequences and symbol histograms of the fused high-order cumulant features for different EEG rhythms in different seizure states, and discusses the quaternion-fused symbolic entropy features of the three classes of EEG signals under different symbolization parameters.

The results show that the proposed method can concisely represent multiple features of the original time series in parallel and fuse them effectively, and thereby achieve correct pattern classification. Geometric algebra can serve as a new mathematical tool for time series data mining and can be applied to time series pattern classification problems in many other fields.
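The triangle representation from the second part of the abstract can be illustrated with a small sketch that treats three feature points as a triangle and derives structure features from it. In geometric algebra, the area of the triangle with edge vectors u and v is half the magnitude of the bivector u ∧ v. The abstract does not say which geometric parameters the thesis found most effective, so the features returned here (area, perimeter, sorted side lengths) and the names `wedge_magnitude` and `triangle_features` are assumptions for illustration.

```python
import math


def wedge_magnitude(u, v):
    """Magnitude of the bivector u ∧ v for 2D or 3D vectors
    (the area of the parallelogram spanned by u and v)."""
    if len(u) != len(v) or len(u) not in (2, 3):
        raise ValueError("expected two 2D or two 3D vectors")
    e12 = u[0] * v[1] - u[1] * v[0]
    if len(u) == 2:
        return abs(e12)
    e23 = u[1] * v[2] - u[2] * v[1]
    e31 = u[2] * v[0] - u[0] * v[2]
    return math.sqrt(e12 ** 2 + e23 ** 2 + e31 ** 2)


def triangle_features(p, q, r):
    """Geometric structure features of the triangle with vertices p, q, r."""
    u = tuple(b - a for a, b in zip(p, q))  # edge p -> q
    v = tuple(b - a for a, b in zip(p, r))  # edge p -> r
    sides = sorted(math.dist(a, b) for a, b in ((p, q), (q, r), (r, p)))
    return {
        "area": 0.5 * wedge_magnitude(u, v),
        "perimeter": sum(sides),
        "sides": sides,
    }


if __name__ == "__main__":
    # made-up feature points, e.g. (time offset, amplitude) of three waveform landmarks
    print(triangle_features((0.0, 0.0), (3.0, 0.0), (0.0, 4.0)))
```

The same function works for points in three-dimensional feature space, since the bivector magnitude is computed from all three coordinate planes.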

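  • 【Online Publication Contributor】 Yanshan University
  • 【Online Publication Year/Issue】 2012, No. 10
  • 【Classification Code】 TP311.13; O211.61
  • 【Citation Count】 2
  • 【Download Count】 400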