
Dynamic Image Sequence Representation and Classification with Application to Human Motion Analysis

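【Author】 Chen Changhong (陈昌红)

【Supervisors】 Jiao Licheng (焦李成); Liang Jimin (梁继民)

【Author information】 Xidian University (西安电子科技大学), Circuits and Systems, 2009, PhD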

【Abstract】 A dynamic image sequence consists of a series of image frames in a fixed relative order. Besides the spatial characteristics it shares with a single image, a dynamic image sequence also has temporal characteristics, namely motion information. Dynamic image sequence modeling analyzes and recognizes the motion patterns in a sequence and describes them in natural language. It is one of the important research directions in computer vision, with important applications in intelligent surveillance, human-computer interaction, motion analysis, and other fields.

Effectively detecting and describing the motion information in an image sequence is the core problem of dynamic image sequence modeling. Time series analysis methods are mature statistical tools for time series problems, but many difficulties arise when they are applied to image sequences. Building on graphical model theory, and in particular on dynamic Bayesian networks (the graphical models used for time series), we discuss the problems that the two most commonly used time series models, the hidden Markov model and the state-space model, encounter when applied to image sequences, and we improve the learning methods of both models. We also propose a model that is better suited to describing dynamic image sequences, and we evaluate our work on human motion analysis.

To overcome the one-sidedness of a single feature and to describe image sequences more comprehensively, accurately, and reliably, we introduce the factorial hidden Markov model as a feature-level fusion method and the parallel hidden Markov model as a decision-level fusion method. Based on the evaluation results and the correlation between features, we examine in depth the factors that affect fusion performance. Both fusion methods effectively improve recognition performance when the combined features each perform well individually and differ little in recognition performance. Given that, the lower the correlation between the features, the better.

Because the dynamic texture model is not suited to describing binary image sequences, we propose two improved dynamic texture models. The first, the binary dynamic texture model, exploits the fact that binary images follow a Bernoulli distribution and learns the model parameters with binary principal component analysis. The second, the tensor subspace dynamic texture model, uses tensor subspace analysis to transform the binary image sequence into a low-dimensional grayscale image sequence, which is then described with the dynamic texture model. Experimental results show that both improved models describe binary image sequences more accurately.

The main limitation of the hidden Markov model and its extensions in describing dynamic image sequences is the difficulty of choosing the hidden states accurately, while the improved dynamic texture models, being linear, cannot describe an image sequence comprehensively. To address these shortcomings, we propose a layered time series model. It is a two-layer statistical model. The first layer approximates nonlinearity with piecewise linearity: the image sequence is divided into several segments, and each segment is described by a dynamic texture model or an improved dynamic texture model. The second layer treats these models as hidden states and builds a hidden Markov model on them to characterize the relationships among them. The observation probability of this hidden Markov model is a function of the distance between an observation and the corresponding observation synthesized by each model. The parameters of the two layers together form the parameters of the layered time series model. Experimental results show that the model overcomes the shortcomings of the hidden Markov model and the dynamic texture model while retaining the synthesis and prediction ability of the dynamic texture model and the hidden Markov model's ability to describe a process, which makes it well suited to describing and recognizing dynamic image sequences.

To address the loss of limb information in gait databases caused by background removal and similar factors, we also propose a representation called the frame difference energy image. This representation captures the static information common to each cluster and uses the positive part of the frame difference between the previous frame and the current frame to represent the change introduced by motion. By combining static and dynamic information, the method both compensates for the missing image information and reflects the change in body shape during walking, which improves the recognition rate.

In addition, McNemar's test is introduced to evaluate different algorithms. The method collects the results of repeated experiments with the different algorithms and computes a first-order measure of the statistical significance of the observed difference between them, which allows a quantitative comparison of their performance. Compared with other methods, this gives a more accurate and reliable evaluation of the algorithms.
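The decision-level fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each feature stream has its own set of per-class hidden Markov models, that the streams are treated as independent so their log-likelihoods add, and that the per-class log-likelihoods have already been computed. The function name `fuse_decisions` and the optional stream weights are hypothetical. The abstract does not state the exact combination rule used in the thesis.

```python
import numpy as np

def fuse_decisions(stream_loglik, weights=None):
    """Combine per-class log-likelihoods from several feature streams.

    stream_loglik: array-like of shape (n_streams, n_classes); entry [s, c]
    is the log-likelihood of the sequence under the class-c model of stream s.
    Returns the index of the winning class and the fused scores.
    """
    ll = np.asarray(stream_loglik, dtype=float)
    if weights is None:
        w = np.ones(ll.shape[0])          # assumption: equal stream weights
    else:
        w = np.asarray(weights, dtype=float)
    fused = w @ ll                        # weighted sum of log-likelihoods
    return int(np.argmax(fused)), fused
```

Summing log-likelihoods corresponds to multiplying the stream likelihoods, so a class that both streams consider plausible can win even if neither stream ranks it first.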
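For readers unfamiliar with the dynamic texture model that the improved models build on, the following is a minimal sketch of a widely used closed-form learning procedure for a linear dynamical system: a singular value decomposition for the observation matrix, followed by least squares for the state transition matrix. The abstract does not describe the learning procedure used in the thesis, so this is an illustration under that assumption. The names `learn_dynamic_texture` and `synthesize` are hypothetical, and the noise terms of the model are omitted.

```python
import numpy as np

def learn_dynamic_texture(frames, n_states):
    """Fit y_t = C x_t + mean, x_{t+1} = A x_t to a (T, H, W) frame array."""
    T = frames.shape[0]
    Y = frames.reshape(T, -1).T.astype(float)        # (pixels, T)
    mean = Y.mean(axis=1, keepdims=True)             # temporal mean image
    U, s, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    C = U[:, :n_states]                              # observation matrix
    X = s[:n_states, None] * Vt[:n_states]           # hidden states, (n_states, T)
    # Least-squares estimate of A from consecutive state pairs.
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return mean, C, A, X

def synthesize(mean, C, A, x0, n_frames):
    """Roll the noise-free model forward from state x0; returns (n_frames, pixels)."""
    x = np.array(x0, dtype=float)
    out = []
    for _ in range(n_frames):
        out.append(C @ x + mean.ravel())
        x = A @ x
    return np.array(out)
```

The same two functions cover synthesis and prediction: starting `synthesize` from the last learned state extrapolates the sequence beyond the observed frames.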
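The frame difference energy image described in the abstract can be sketched as follows. The sketch assumes binary silhouette frames, takes the static part to be the thresholded average silhouette of the cluster the frame belongs to, and adds the static and dynamic parts together. The threshold fraction `keep`, the additive combination, and the function name are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def frame_difference_energy_image(prev_frame, cur_frame, cluster_frames, keep=0.8):
    """Combine a cluster's static silhouette information with motion change.

    prev_frame, cur_frame: (H, W) binary silhouettes at times t-1 and t.
    cluster_frames: (N, H, W) silhouettes of the cluster containing cur_frame.
    keep: assumed fraction of the peak energy a pixel needs to be retained.
    """
    cluster = np.asarray(cluster_frames, dtype=float)
    energy = cluster.mean(axis=0)                      # average silhouette of the cluster
    dominant = np.where(energy >= keep * energy.max(), energy, 0.0)
    diff = np.asarray(prev_frame, dtype=float) - np.asarray(cur_frame, dtype=float)
    positive = np.clip(diff, 0.0, None)                # pixels present at t-1 but missing at t
    return dominant + positive
```

Keeping only the positive part of the difference restores pixels that were present in the previous frame but lost in the current one, which is how the representation compensates for incomplete silhouettes.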
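McNemar's test, as used above for comparing algorithms, can be sketched in a few lines. The sketch assumes that both algorithms are run on the same test samples and uses the chi-square form of the test with a continuity correction. Whether the thesis used this form or an exact binomial form is not stated in the abstract. The function name `mcnemar_test` is hypothetical.

```python
import math

def mcnemar_test(correct_a, correct_b):
    """McNemar's test on paired per-sample outcomes of two algorithms.

    correct_a, correct_b: sequences of booleans, True where the algorithm
    classified the sample correctly. Returns (statistic, p_value).
    """
    pairs = list(zip(correct_a, correct_b))
    n01 = sum(1 for a, b in pairs if not a and b)    # A wrong, B right
    n10 = sum(1 for a, b in pairs if a and not b)    # A right, B wrong
    if n01 + n10 == 0:
        return 0.0, 1.0                              # no disagreements at all
    # Chi-square statistic with continuity correction, 1 degree of freedom.
    stat = max(abs(n01 - n10) - 1, 0) ** 2 / (n01 + n10)
    # Survival function of chi-square with 1 dof: P(Z^2 > stat).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

Only the samples on which the two algorithms disagree enter the statistic, so two algorithms with similar overall accuracy can still differ significantly if their errors fall on different samples in an unbalanced way.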
