
运动成像平台近景视频运动目标检测技术研究

Research on Close Range Video Moving Object Detection from Moving Platform

【Author】 孙浩 (Sun Hao)

【Advisor】 王博亮 (Wang Boliang)

【Author Information】 National University of Defense Technology (国防科学技术大学), Electronic Science and Technology, 2011, PhD

【Abstract (translated from the Chinese)】 With the advance of science and technology, it has become ever easier to acquire and store video, and the volume of digital video data has grown rapidly. Moving object detection is the foundation of video content analysis and is of great importance in both scientific research and engineering applications. Among its sub-problems, moving object detection in close-range video captured from a moving imaging platform is a difficult problem that deserves particular attention, owing to the combined influence of large variations in object depth, static or dynamic occlusion among multiple objects, unpredictable scene changes, and the motion of both the objects and the imaging platform. Methods for moving object detection from a moving platform can be divided into methods based on motion analysis and methods based on statistical learning. Focusing on this problem, this dissertation studies how to represent video data with local invariant features, and how to analyze it with motion-analysis and statistical-learning strategies, so as to discover and localize moving objects.

On video content representation with local invariant features, the main contributions are: (1) a spatio-temporal context for local video features, defined from spatial neighborhood relations and temporal motion-similarity relations, which strengthens the descriptive power of local features; (2) a spatial image pyramid representation of the video frames, which describes the structural relations among object parts at multiple levels and effectively fuses local appearance with global structure; (3) a closed-loop feature matching method for video image features, which improves matching reliability while preserving the number of matches.

On moving object detection based on motion analysis, a method based on multiview geometric constraints is proposed for close-range video from a moving platform. Its main innovations are: (1) for a converging binocular stereo rig, a multiview epipolar constraint built from four views, which resolves the failure of the standard epipolar constraint to detect moving objects when the camera and the object move in the same direction; (2) within a particle-filter framework, an update strategy that combines adaptive state prediction with multiview-epipolar-constraint state observation to detect and track multiple moving objects simultaneously, handling objects that enter or leave the field of view at different times.

On moving object detection based on unsupervised statistical learning, an algorithm based on dynamic topic discovery is proposed. Building on a robust sparse spatio-temporal-context visual-word representation and an unsupervised learning strategy, a probabilistic topic model is used to model the moving objects in close-range video from a moving platform. The model essentially comprises two processes: (1) at the feature level, salient video patches that are insensitive to pose, scale, and illumination changes are extracted; these patches carry local information about moving objects of different categories; (2) at the object level, the similarity of structural and motion patterns across frames is used to build the object model.

On moving object detection based on supervised statistical learning, taking infrared pedestrians as an example, a category-specific detection algorithm based on discriminative models is proposed. Its main innovations are: (1) a region-of-interest extraction method based on feature-point-centered sliding-window search, which extracts candidate pedestrian regions stably under different scene conditions and, while preserving the detection rate, greatly reduces the number of candidate windows and improves efficiency; (2) a pyramid binary pattern feature that describes human targets in infrared images with both local texture and global structure, extended to a three-dimensional dynamic pyramid binary pattern for describing pedestrians in infrared video.
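The abstract does not reproduce the four-view epipolar construction itself, so the sketch below only illustrates the underlying two-view idea that the thesis method extends: background feature tracks should satisfy the epipolar constraint x₂ᵀFx₁ ≈ 0, and tracks with a large Sampson error are candidate moving-object points. The function name, the threshold, and the use of OpenCV's RANSAC estimator are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
import cv2


def epipolar_violation(pts_prev, pts_curr, threshold=1.0):
    """Flag feature tracks that violate the two-view epipolar constraint.

    pts_prev, pts_curr: (N, 2) arrays of matched image points from two
    camera positions. Static-background tracks satisfy x2^T F x1 = 0 up
    to noise; tracks with a large Sampson error are candidate moving
    object points (illustrative simplification of the thesis' four-view
    constraint).
    """
    pts_prev = np.asarray(pts_prev, dtype=np.float64)
    pts_curr = np.asarray(pts_curr, dtype=np.float64)

    # Estimate F robustly; most correspondences come from the static background.
    F, _ = cv2.findFundamentalMat(pts_prev, pts_curr, cv2.FM_RANSAC, 1.0, 0.99)
    if F is None:
        return np.zeros(len(pts_prev), dtype=bool)  # not enough matches

    # Homogeneous coordinates.
    x1 = np.hstack([pts_prev, np.ones((len(pts_prev), 1))])
    x2 = np.hstack([pts_curr, np.ones((len(pts_curr), 1))])

    # Sampson distance for each correspondence.
    Fx1 = (F @ x1.T).T           # epipolar lines in the second image
    Ftx2 = (F.T @ x2.T).T        # epipolar lines in the first image
    x2Fx1 = np.sum(x2 * Fx1, axis=1)
    denom = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    sampson = x2Fx1 ** 2 / denom

    return sampson > threshold   # True where the constraint is violated
```

When the camera and an object translate in the same direction, such points can still lie on their epipolar lines, which is exactly the degenerate case the four-view constraint in the thesis is designed to remove.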

【Abstract】 With the development of science and technology, people can now easily acquire and store a variety of videos, and digital video data has grown rapidly. Moving object detection is a fundamental part of video content analysis and plays an important role in both scientific research and engineering applications. In particular, moving object detection in close-range videos from a moving platform has received increasing attention because of the combined effects of variations in object depth, static or dynamic occlusion between multiple objects, unpredictable scene structure, and the motion of both the objects and the camera. The two typical groups of methods for moving object detection from a moving platform are methods based on motion analysis and methods based on statistical learning. In order to discover and localize moving objects in close-range videos from a moving platform, this dissertation focuses on the use of local invariant features to represent video data, and on motion-analysis and statistical-learning strategies to analyze it.

In video processing based on local invariant features, we emphasize video content representation by local features. (1) We propose a spatio-temporal context based on spatial neighborhood and temporal motion similarity to enhance the description of local video features. (2) We propose describing local features in a spatial image pyramid in order to capture the global configuration of different object parts; both local and global cues are used for object description. (3) We propose a closed-loop mapping feature matching method for video feature matching; at the same level of reliability, it obtains more matches.

In moving object detection based on motion analysis, we propose a moving object detection method based on multiview geometric constraints. The major contributions are: (1) a new multiview epipolar constraint built from consecutive positions of a binocular camera in a non-parallel (converging) configuration, which can detect moving objects when the object and the camera move in the same direction, where the standard epipolar geometry fails; (2) detection and tracking of multiple moving objects within a particle-filter framework, which handles objects entering or leaving the field of view.

In moving object detection based on unsupervised statistical learning, we propose a moving object detection algorithm based on dynamic topic discovery. Taking advantage of a robust representation of video by spatio-temporal context words and an unsupervised learning strategy, we use dynamic topic modeling to discover and localize moving objects in close-range videos from a moving platform. In essence, the model operates at two levels: (1) at the feature level, distinctive video patches that are robust to position, scale, and lighting variations are extracted; these patches carry important information about moving objects of different classes; (2) at the object level, structure and motion similarity across frames are used to build the object model.

In moving object detection based on supervised statistical learning, we propose an infrared pedestrian detection algorithm based on discriminative models, as an instance of category-specific moving object detection. The major innovations are: (1) a feature-centric sliding-window region-of-interest extraction method that performs robustly under different scenarios while greatly reducing the computational cost; (2) a pyramid binary pattern (PBP) feature for infrared person description, which combines local texture and global shape information and is extended to a 3D form for pedestrian description in infrared video.
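The PBP feature is only named in the abstract, so the following is an assumption-laden sketch rather than the thesis descriptor: it uses uniform LBP codes from scikit-image as the local binary pattern and concatenates per-cell histograms over a three-level spatial pyramid (1×1, 2×2, 4×4) to combine local texture with global layout. The 3D dynamic extension over time is not shown.

```python
import numpy as np
from skimage.feature import local_binary_pattern


def pyramid_lbp_descriptor(gray, levels=3, n_points=8, radius=1):
    """Sketch of a pyramid-of-binary-patterns descriptor.

    gray: 2D grayscale image (e.g., an infrared pedestrian candidate window).
    LBP codes capture local texture; concatenating per-cell histograms over
    an increasingly fine spatial grid adds global structure, in the spirit
    of the PBP feature described above (illustrative approximation).
    """
    codes = local_binary_pattern(gray, n_points, radius, method="uniform")
    n_bins = n_points + 2                     # number of 'uniform' LBP codes
    h, w = codes.shape
    hists = []
    for level in range(levels):
        cells = 2 ** level                    # cells per side at this level
        for i in range(cells):
            for j in range(cells):
                cell = codes[i * h // cells:(i + 1) * h // cells,
                             j * w // cells:(j + 1) * w // cells]
                ch, _ = np.histogram(cell, bins=n_bins,
                                     range=(0, n_bins), density=True)
                hists.append(ch)
    return np.concatenate(hists)              # length: n_bins * (1 + 4 + 16)
```

A descriptor of this form would typically be fed to a discriminative classifier (for example, a linear SVM) trained on candidate windows produced by the region-of-interest extraction step.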
