

Research on Object Tracking and Classification for Intelligent Video Surveillance

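【Author】 Li Zhihua (李志华)

【Advisor】 Chen Yaowu (陈耀武)

【Author information】 Zhejiang University, Electronic Information Technology and Instrumentation, 2008, PhD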

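【Abstract (translated from the Chinese)】 With the rapid growth in demand for video surveillance and the ever larger scale of monitoring, manual monitoring falls far short of requirements, and making video surveillance systems "intelligent" is increasingly urgent. Research on highly real-time, highly accurate intelligent visual analysis algorithms, multi-camera visual data fusion, and optimization problems is the key to intelligent video surveillance systems. This thesis studies moving object tracking and classification in intelligent video surveillance systems in depth, from moving object segmentation, tracking, and classification to continuous object tracking based on multi-camera data fusion, optimized tracking assignment, and mutual cooperation between object classification and tracking. It also designs and implements an intelligent camera prototype system based on cooperation between an embedded CPU and a DSP. The main research contents are summarized as follows:

1. Complex background modeling based on region segmentation
For complex surveillance scenes, a background modeling algorithm based on region segmentation is proposed to eliminate interference from dynamic background objects and illumination changes, as well as irrelevant shadow regions. The surveillance background is classified into regions in chromaticity and luminance space. Stable regions with small variation use a simple, fast Adaptive Single Gaussian Model; dynamic regions with large variation use a computationally more expensive but effective Nonparametric Model. The algorithm uses a Generalized Agglomerative Scheme (GAS) to cluster and fill small gaps in the dynamic regions, and extends those regions outward by a suitable number of pixels at their boundaries, which improves the adaptability of the region segmentation to dynamic environments. In the training stage of the nonparametric background model for dynamic regions, a Two-Threshold Sequential Algorithmic Scheme (TTSAS) clusters all background samples into several Gaussian distribution classes in order to speed up the kernel density computation for new samples.

2. Real-time moving object tracking based on dynamic model switching
For crowded surveillance scenes, a real-time moving object tracking algorithm based on dynamic model switching is proposed to keep tracking stable under occlusion. After the occlusion state of each target is determined, non-occluded single moving targets are tracked with a simple, fast region-based model, and mutually occluding composite targets are tracked with a model based on narrow-baseline SIFT feature matching. Because the scale and shape of a tracked target change very little between adjacent frames, and the motion range predicted from the target position is limited, the SIFT matching model performs fast narrow-baseline matching over a small search range and achieves stable tracking under occlusion.

3. Continuous object tracking and tracking optimization based on multiple cameras
For wide-area surveillance scenes, a method for continuous object tracking and tracking optimization based on multiple cameras is proposed. SIFT feature matching between the cameras' background images automatically detects the overlapping fields of view, and the matched SIFT keypoints are used to compute the coefficients of the homography matrix between the overlapping views. Together, SIFT feature matching and the homography transformation give the tracking system stable continuous tracking. To obtain better tracking results, a multi-camera tracking optimization algorithm performs weighted data fusion of each target's tracking priority with its occlusion state and segmented image size in each camera, preferentially assigns high-priority targets to the camera with the best weight, and dynamically balances the computing resources and tracking load of the cameras. The method requires neither camera calibration nor scene modeling, so it is widely applicable.

4. Moving object classification
For traffic surveillance scenes, an object classification algorithm based on partitioned, normalized, weighted features is proposed. Simple and effective motion and shape features are extracted, and the scene is partitioned by road regions with different traffic directions and by scene location to make the features more discriminative. After partitioning, AdaBoost is applied to evaluate the relative importance of each feature and assign it a normalized weight, finally producing a linear classifier. Because occlusion severely degrades classification performance, cooperation between classification and tracking based on data fusion across overlapping cameras is used to improve classification accuracy in congested scenes. Exploiting the different viewing directions of the overlapping cameras, viewpoint correspondence and data fusion between cameras determine the best classification and tracking results, which improves both classification accuracy and tracking stability under occlusion.

Building on the above research, the thesis finally addresses the core functional unit of an intelligent video surveillance system, the embedded intelligent camera, and designs a new real-time embedded intelligent camera system. The design is based on a hardware architecture in which an embedded CPU and a DSP cooperate, implements a software framework in which the intelligent visual analysis and network interaction modules run in parallel, and provides a useful example of intelligent camera prototype design.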
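The TTSAS step mentioned under item 1 can be illustrated with a minimal sketch. It assumes scalar background samples (for example, one pixel's luminance history) and uses the distance to each cluster mean as the dissimilarity measure; the function names, thresholds, and sample values are hypothetical and are not taken from the thesis.

```python
from statistics import fmean, pvariance


def ttsas(samples, theta1, theta2):
    """Two-threshold sequential clustering of scalar samples (illustrative).

    A sample closer than theta1 to the nearest cluster mean joins that cluster,
    a sample farther than theta2 starts a new cluster, and anything in between
    is deferred to a later pass.
    """
    if not theta1 < theta2:
        raise ValueError("theta1 must be smaller than theta2")
    clusters = []
    pending = list(samples)
    while pending:
        deferred = []
        progressed = False
        for x in pending:
            if not clusters:
                clusters.append([x])
                progressed = True
                continue
            nearest = min(clusters, key=lambda c: abs(x - fmean(c)))
            d = abs(x - fmean(nearest))
            if d < theta1:
                nearest.append(x)
                progressed = True
            elif d > theta2:
                clusters.append([x])
                progressed = True
            else:
                deferred.append(x)
        if not progressed:
            # Nothing was resolved in this pass: force the first deferred
            # sample into its own cluster so the loop always terminates.
            clusters.append([deferred.pop(0)])
        pending = deferred
    return clusters


def summarize(clusters):
    """Describe each cluster as (mean, variance, sample count)."""
    return [(fmean(c), pvariance(c), len(c)) for c in clusters]


def order_by_proximity(clusters, x):
    """Order clusters so the one whose mean is closest to x comes first."""
    return sorted(clusters, key=lambda c: abs(fmean(c) - x))


# Hypothetical luminance history of one pixel in a dynamic region.
history = [10, 11, 50, 52, 22, 12, 51]
clusters = ttsas(history, theta1=5, theta2=15)
```

Visiting the clusters in the order returned by `order_by_proximity` is one plausible way to evaluate the closest group first when estimating the density of a new sample; how the thesis actually schedules and truncates that computation is not specified in the abstract.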

【Abstract】 With the rapid growth of video surveillance requirements and the steady expansion of surveillance areas, manual monitoring can no longer meet surveillance needs, and making video surveillance systems "intelligent" is becoming increasingly urgent. Highly real-time, accurate intelligent video analysis algorithms, together with multi-camera data fusion and optimization, are the key to intelligent video surveillance systems. This thesis studies object tracking and classification in intelligent video surveillance systems, covering motion detection, object tracking and classification, as well as continuous object tracking, tracking optimization, and cooperation between object classification and tracking based on multi-camera data fusion. The thesis also implements an intelligent camera prototype based on cooperation between an embedded CPU and a DSP. The main contents are summarized as follows:

1. Background modeling algorithm based on region segmentation
For complex surveillance scenes, a background modeling algorithm based on region segmentation is proposed. The surveillance background is classified into regions in chromaticity and luminance space. The adaptive single Gaussian model is used in stable regions with gradual changes, and the nonparametric model is used in dynamic regions with abrupt changes. A Generalized Agglomerative Scheme is used to merge the pixels in the dynamic region and fill small gaps. A Two-Threshold Sequential Algorithmic Scheme is used to group the background samples of the dynamic region into distinct Gaussian distributions. The complexity of the kernel density computation is greatly reduced by ordering the computation over these groups according to how close each group's mean is to the pixel sample being estimated.

2. Moving object tracking based on dynamically switching models
For congested surveillance scenes, a moving object tracking method based on dynamically switching models is proposed. After the occlusion state of the moving objects is estimated, a simple region-based tracking model is used for non-occluded objects and a narrow-baseline image matching model based on SIFT (Scale Invariant Feature Transform) features is used for occluded objects. Because a tracked target changes very little in scale between neighboring frames, and the predicted range of its position is limited, the SIFT-based matching model achieves fast feature matching and robust tracking in complex occlusion scenes.

3. Continuous target tracking and tracking optimization based on multiple cameras
For wide-area video surveillance scenes, a method for continuous target tracking and tracking optimization based on multiple cameras is proposed. The central server detects the overlapping areas between the cameras' fields of view (FOV) by matching SIFT features between their background images. Viewpoint correspondence between overlapping cameras is then established through the ground-plane homography matrix computed from the matched SIFT keypoints. SIFT feature matching and the homography transformation together achieve stable continuous target tracking. To track moving targets effectively across multiple cameras, a tracking optimization algorithm based on multi-camera data fusion is proposed, which fuses the assigned priority of each moving target with its occlusion state and segmented image size in each camera. The algorithm preferentially allocates the best camera to high-priority targets and dynamically balances the tracking load of the cameras. The method needs neither camera calibration nor environment modeling, so it is widely applicable.

4. Research on object classification algorithms
For traffic surveillance scenes, an object classification algorithm based on partitioned, normalized, weighted features is proposed. The algorithm extracts effective motion and shape features, and the discriminability of these features is improved by partitioning the surveillance scene according to traffic road boundaries and scene locations. After partitioning, the AdaBoost method is used to evaluate the relative importance of each feature and assign it a normalized weight. Because occlusion reduces the accuracy of the classification algorithm, an object classification and tracking method based on cooperation among multiple overlapping cameras is proposed to improve classification accuracy in congested scenes. The optimized classification and tracking results are determined by viewpoint correspondence and data fusion among the overlapping cameras. By exploiting the different viewing directions of the overlapping cameras, both classification accuracy and tracking stability in complex occlusion scenes are improved.

Based on the above research, a novel, high-performance intelligent camera system is designed for the core unit of intelligent video surveillance systems, the embedded intelligent camera. The design is based on a cooperative hardware architecture of an embedded CPU and a DSP, and implements a software framework in which the intelligent video analysis and network interaction modules run in parallel. The system provides a prototype design for an intelligent camera.
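The camera allocation idea in item 3 can be sketched as a greedy, priority-ordered assignment. This is only an illustration under stated assumptions: the scoring formula, the weights `w_visible`, `w_size`, and `w_load`, the per-camera `capacity`, and all data values are hypothetical, since the abstract does not give the actual fusion weights or allocation procedure.

```python
def allocate(targets, observations, capacity,
             w_visible=0.6, w_size=0.4, w_load=0.5):
    """Assign each target to one camera, highest priority first (illustrative).

    targets:      {target_id: priority}, higher means more important
    observations: {(camera_id, target_id): (occluded, area_px)}
    capacity:     {camera_id: maximum number of targets it may track}
    """
    load = {cam: 0 for cam in capacity}
    max_area = max((area for _, area in observations.values()), default=1)
    assignment = {}
    for tid in sorted(targets, key=targets.get, reverse=True):
        best_cam, best_score = None, float("-inf")
        for (cam, seen), (occluded, area) in observations.items():
            if seen != tid or load[cam] >= capacity[cam]:
                continue
            # Fused score: reward an unoccluded view and a large segmented
            # image, penalize cameras that are already heavily loaded.
            score = (w_visible * (0.0 if occluded else 1.0)
                     + w_size * area / max_area
                     - w_load * load[cam] / capacity[cam])
            if score > best_score:
                best_cam, best_score = cam, score
        if best_cam is not None:
            assignment[tid] = best_cam
            load[best_cam] += 1
    return assignment


# Hypothetical scene: two overlapping cameras, two targets.
targets = {"car": 3, "person": 1}
observations = {
    ("A", "car"): (False, 4000),
    ("B", "car"): (True, 6000),
    ("A", "person"): (False, 1000),
    ("B", "person"): (False, 900),
}
capacity = {"A": 2, "B": 2}
assignment = allocate(targets, observations, capacity)
```

In this example the high-priority car goes to camera A, which sees it unoccluded, and the load penalty then steers the person to camera B even though camera A has a slightly larger view of it. A greedy pass is the simplest choice here; a global assignment method could be substituted if the cameras' scores interact more strongly.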

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year/issue】 2009, Issue 11
  • 【Classification code】 TP391.41
  • 【Citations】 25
  • 【Downloads】 3356