

Visual Tracking Algorithm Based on Feature-Learning and Feature-Imagination

【Author】 徐萧萧 (Xu Xiaoxiao)

【Advisor】 陈宗海 (Chen Zonghai)

【Author information】 University of Science and Technology of China, Control Science and Engineering, 2010, PhD

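【Abstract (translated from Chinese)】 Vision-based object tracking is a popular and difficult topic in computer vision. It aims to use computers to track moving objects in video. Visual tracking has broad research significance and application prospects in areas such as intelligent video surveillance, image compression, and medical diagnosis. Although many methods for object tracking exist, many problems remain to be solved. This dissertation takes the design of visual tracking algorithms based on human visual cognition as its starting point. It simulates the feature learning and feature imagination (association) that human vision exhibits when recognizing and tracking targets, and analyzes in full the process by which human vision recognizes and tracks objects. It proposes the concepts of feature learning and feature imagination in visual tracking, establishes a complete general theoretical framework for visual tracking algorithms based on Feature-Learning and Feature-Imagination (hereafter FLFI), and attempts to apply the framework to human body tracking. FLFI-based visual tracking incorporates the way human vision thinks into traditional algorithms, breaks with the conventional mindset of visual tracking algorithm design, and has broad theoretical and application prospects. The main work and contributions of this dissertation are as follows:
1. From the perspective of cognitive science and cognitive psychology, we analyze the intelligent characteristics of human vision, explore the object tracking mode of the human visual system, and describe the attention, learning, memory, and imagination characteristics of human visual thinking. On this basis we give an FLFI-based visual tracking framework, which introduces the learning and imagination characteristics of human vision into the traditional visual tracking framework.
2. Drawing on the intelligent characteristics of the human visual system, we give a feature representation for variable-pose targets that introduces the concept of a state space into the traditional vector-based feature representation. We use dynamic weighted updating to learn, in real time, the features of a variable-pose target in its different states. We then propose a visual tracking model based on the feature-imagination property of human vision and give general methods for inference and for training the model parameters.
3. Taking into account the current research status and techniques of human tracking, we simplify the FLFI-based visual tracking framework and give an FLFI-based human tracking method. The method uses feature learning to extract the target's initial features and uses feature matching to decide whether the target is occluded. Once the target has been occluded, a matching method based on feature imagination recovers the track. The method requires only that the target's state be specified at initialization; during tracking it needs neither human pose recognition nor target localization under occlusion.
4. To apply the FLFI-based visual tracking framework to human tracking more effectively, we study in depth the problems of human pose recognition and occlusion in visual tracking. For pose recognition, we propose a human pose estimation algorithm based on head-shoulder segmentation. For upright, walking people, the algorithm divides the pose into 6 states and estimates it from the regularities and characteristics of the human body in 2D imaging. Occlusion is handled by combining histogram matching with block-based local feature matching.
Finally, the dissertation introduces the intelligent video surveillance system jointly developed by members of the laboratory's vision group.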

【Abstract】 Vision-based object tracking is an active and challenging research topic in computer vision. It aims to track moving objects in video by computer. Visual tracking has significant research value and promising applications in fields such as intelligent surveillance, image compression, and medical diagnosis. Although many methods can achieve object tracking, many issues remain to be resolved. This dissertation approaches the design of visual tracking algorithms from the standpoint of human visual perception. We analyze the process of human visual tracking by simulating the human visual characteristics of feature-learning and feature-imagination, and we propose the concepts of feature-learning and feature-imagination in visual tracking. We establish a complete general theoretical framework based on feature-learning and feature-imagination (FLFI) and apply it to human body tracking. The FLFI-based visual tracking algorithm integrates the way human vision thinks with traditional visual tracking methods, breaks with the current mindset of visual tracking algorithm design, and has broad prospects in both theory and application. The main work and contributions of this dissertation are:
1. We analyze the intelligence of human vision from the perspective of cognitive science and cognitive psychology, discuss the object tracking model of human vision, and describe the characteristics of human visual thinking: attention, learning, memory, and imagination. On this basis we propose a visual tracking architecture based on FLFI, which integrates learning and imagination with traditional visual tracking methods.
2. We give a feature representation method that draws on the intelligence of human vision. This method introduces the concept of a state space into traditional vector-based feature representations. We propose a dynamic weighted update method to learn the features of a variable-pose object in its different states. We then give a visual tracking model based on feature-imagination and introduce its general inference and learning methods.
3. Based on the current research status and existing techniques of human tracking, we present a method for real-time human tracking based on FLFI. The method extracts object features by feature-learning at the beginning of tracking, detects occlusion by feature matching, and recovers tracking by feature-imagination. It needs neither human pose recognition nor object localization under occlusion during tracking; it only requires the initial object state to be specified at the start.
4. To apply the FLFI visual tracking architecture to human tracking more effectively, we also study human pose recognition and object tracking under occlusion. For human pose recognition, we give a human pose estimation algorithm based on head-shoulder segmentation. For upright, walking people, the algorithm divides the pose into six states and estimates it from the characteristics of the human body in 2D imaging. To handle occlusion, we propose a method that combines histogram matching with block-based local feature matching.
Finally, we introduce an intelligent surveillance system developed by members of the vision group in our lab.
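
The abstract does not give formulas, so the following is only a minimal sketch of how the dynamic weighted update of per-state features and the histogram-matching occlusion test might fit together. Everything specific in it is an assumption made for illustration, not the dissertation's method: the class name `StateFeatureModel`, the exponential moving average used as the "dynamic weighted update", the Bhattacharyya coefficient used as the histogram similarity, the threshold value, and the `recall_state` helper that stands in very loosely for feature-imagination.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalized histograms (1.0 = identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


class StateFeatureModel:
    """Hypothetical per-state feature store: one histogram template per pose state."""

    def __init__(self, alpha=0.1, occlusion_threshold=0.7):
        self.alpha = alpha                    # assumed update weight for new observations
        self.threshold = occlusion_threshold  # assumed similarity cut-off
        self.templates = {}                   # state label -> normalized histogram

    def learn(self, state, hist):
        """Blend a new observation into the template of the given state."""
        obs = normalize(hist)
        old = self.templates.get(state)
        if old is None:
            self.templates[state] = obs
        else:
            a = self.alpha
            self.templates[state] = [(1 - a) * o + a * n for o, n in zip(old, obs)]

    def similarity(self, state, hist):
        """Similarity between an observation and the stored template of a state."""
        return bhattacharyya(self.templates[state], normalize(hist))

    def is_occluded(self, state, hist):
        """Flag occlusion when the observation no longer matches the template."""
        return self.similarity(state, hist) < self.threshold

    def recall_state(self, hist):
        """Return the learned state whose template best matches the observation."""
        return max(self.templates, key=lambda s: self.similarity(s, hist))
```

In this sketch a tracker would call `learn` while the target is clearly visible, `is_occluded` on each new frame, and `recall_state` when trying to reacquire the target after an occlusion. The block-based local feature matching mentioned in contribution 4 could be approximated by keeping one such model per image block, but that is not shown here.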


