
多目标的图像检测

Detection of Multi-target Image

【Author】 Xu Lai (徐来)

【Advisors】 Chen Qingzhang (陈庆章); Zhou Delong (周德龙)

【Author information】 Zhejiang University of Technology (浙江工业大学), Computer Application Technology, 2010, Master's thesis

【Subtitle】 Face and Eye Detection

【摘要】 随着模式识别和人工智能的高速发展,多目标图像检测技术得到国内外社会各界的广泛的关注和深入研究。图像检测识别在科技领域和安全领域方面上都具有很强大的发展前景和广大的市场潜在的经济价值。同时,在人们的社会生活中,还是在市场商用领域,多目标图像检测也都发挥着重要的作用。但随着社会的进步和技术的更新发展,人们对图像检测识别技术的要求也越来越高。检测率的高低和实时性的好坏是图像检测识别技术的最主要的两个性能指标,也是该技术目前所要研究突破的方向。由于图像检测技术一直受限于目标形态的多样性,目标的遮挡问题以及背景复杂度或外部环境等诸多影响,从而对降低了检测速度和精度。本文就针对提高图像目标检测系统的整体综合性能,并实现多个目标同时检测做了以下主要工作和成果:1.通过DirectShow构建视频采集系统,作为图像目标检测的图像采集模块,并有效的结合WDM视频捕捉,共同协作完成对构成视频图像的预览模型。由于Directshow能让复杂的数据流在不同硬件上同步传输变得简单有效,保证了后续图像检测的实时性。2.为了提高训练的速度,解决传统Adaboost训练计算量大、时间长的问题,提高检测的精度。在Adaboost弱分类器训练时,采用了特征选取时确定阈值搜索范围,减少搜索的时间,提高阈值的最优化,使得减少了弱分类的个数,最终提高Adaboost训练的训练速度。3.为了满足动态的图像检测实时性的需要,本文结合了分层思想和半像素匹配的快速搜索算法对图像进行实时精确的匹配,大大的减少了检测时搜索匹配的计算量和时间,提高了匹配的精度。同时,研究了小波变换理论,根据小波各频域表征的特性,在低频区域采用加权平均法,在高频区域采用基于区域的方法。4.通过Adaboost算法分类器的训练对多姿态不同的人脸进行检测,并在检测到的人脸区域再对人眼进行粗定位,这里分别应用了Gabor小波变换的人眼定位、DCT变换人眼模板匹配的人眼定位以及adaboost训练分类器的人眼定位方法,并进行了简单的比较分析各自的优缺点及算法性能。然后再用二值化处理,积分投影变换,并结合人眼的几何知识的方法对人眼进行精确的定位并标注。由于,经过人脸检测后,图像进行了有效的归一化,已大大的缩小了人眼的检测范围,再通过图像处理,从而有效的缓解了光照的影响,以及多种方法的融合,能快速准确的锁定瞳孔位置,综合的提高了检测系统的性能。实验结果表明,系统的整体检测效果与传统的检测方法相比,有了较明显的提高。但仍存在一些弊端,如旋转角度大的目标,检测效果就大大的降低了。如何校正旋转目标的匹配问题,需要进一步的研究改进。

【Abstract】 With the rapid development of pattern recognition and artificial intelligence, multi-target image detection has attracted wide attention and intensive study from researchers in China and abroad. Image detection and recognition has strong prospects and considerable potential economic value in both science and technology and security, and multi-target image detection already plays an important role in everyday life and in commercial applications. As society and technology advance, the demands placed on image detection and recognition keep rising. Detection rate and real-time performance are its two primary performance measures, and they are also the directions in which the technology most needs a breakthrough. Detection has long been limited by the variety of target shapes, by occlusion of targets, and by complex backgrounds and other environmental factors, all of which reduce detection speed and accuracy. This thesis aims to improve the overall performance of an image target detection system and to detect multiple targets simultaneously. The main work and results are as follows:
1. A video capture system is built with DirectShow to serve as the image acquisition module for target detection, and it is combined with WDM video capture to form the video preview model. Because DirectShow simplifies the synchronized transport of complex data streams across different hardware, it helps preserve the real-time performance of the subsequent detection steps.
2. To speed up training and improve detection accuracy, the thesis tackles the heavy computation and long running time of traditional AdaBoost training. When a weak classifier is trained, the threshold search range is fixed during feature selection. This shortens the search, improves the chosen threshold, reduces the number of weak classifiers needed, and ultimately speeds up AdaBoost training.
3. To meet the real-time requirements of dynamic image detection, a hierarchical search strategy is combined with a fast half-pixel matching search algorithm to match images accurately in real time. This greatly reduces the computation and time spent on search and matching during detection and improves matching accuracy. Wavelet transform theory is also studied: following the characteristics of each wavelet frequency band, a weighted-average method is applied to the low-frequency region and a region-based method to the high-frequency region.
4. A classifier trained with AdaBoost detects faces in multiple poses, and the eyes are then coarsely located within each detected face region. Three eye-location methods are applied here: the Gabor wavelet transform, DCT-based eye template matching, and an AdaBoost-trained classifier. Their strengths, weaknesses, and performance are briefly compared. Binarization and integral projection, combined with geometric knowledge of the eyes, are then used to locate and mark the eyes precisely. Because the image is normalized after face detection, the search range for the eyes is greatly reduced. Further image processing mitigates the influence of illumination, and fusing several methods lets the system lock onto the pupil position quickly and accurately, which improves overall detection performance.
Experimental results show that the system's overall detection performance is noticeably better than that of traditional detection methods. Some weaknesses remain: for example, detection performance drops sharply for targets with large rotation angles. How to correct the matching of rotated targets needs further study.
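The threshold-range idea in item 2 can be illustrated with a single-feature weak classifier (a decision stump). The sketch below is not the thesis's implementation: the abstract does not say how the search range is chosen, so the interval between the weighted class means used here is a hypothetical choice, and the names `train_stump` and `class_mean_range` are invented for illustration. It assumes both classes are present in the training samples.

```python
def class_mean_range(values, labels, weights):
    """Hypothetical search range: the interval between the weighted
    mean feature value of the negative class and of the positive class."""
    def weighted_mean(target):
        picked = [(v, w) for v, y, w in zip(values, labels, weights) if y == target]
        return sum(v * w for v, w in picked) / sum(w for _, w in picked)

    neg_mean, pos_mean = weighted_mean(-1), weighted_mean(1)
    return (min(neg_mean, pos_mean), max(neg_mean, pos_mean))


def train_stump(values, labels, weights, search_range=None):
    """Pick the threshold and polarity with the lowest weighted error.

    values       : one feature value per sample
    labels       : +1 (target) or -1 (non-target) per sample
    weights      : non-negative AdaBoost sample weights
    search_range : optional (low, high); only thresholds inside this
                   interval are evaluated (assumed restriction)
    Returns (threshold, polarity, weighted_error), or None if no
    candidate threshold falls inside the range.
    """
    total = sum(weights)
    candidates = sorted(set(values))
    if search_range is not None:
        low, high = search_range
        candidates = [v for v in candidates if low <= v <= high]

    best = None
    for threshold in candidates:
        # Weighted error of the rule "predict +1 when value >= threshold".
        err = sum(w for v, y, w in zip(values, labels, weights)
                  if (1 if v >= threshold else -1) != y)
        # Polarity -1 inverts the rule, so its error is total - err.
        for polarity, e in ((1, err), (-1, total - err)):
            if best is None or e < best[2]:
                best = (threshold, polarity, e)
    return best
```

Calling `train_stump(values, labels, weights, class_mean_range(values, labels, weights))` evaluates only the candidate thresholds between the two class means instead of every distinct feature value. Whether such a restriction still finds the best threshold depends on the data, so it trades some search completeness for speed.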
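The precise eye-location step in item 4 (binarization followed by integral projection, constrained by facial geometry) can be sketched roughly as follows. This is an illustrative assumption, not the author's code: the fixed gray threshold, the rule that the eyes lie in the upper half of the normalized face, the left/right split at the image midline, and all function names are hypothetical. The input is a grayscale face region given as a list of rows at least two pixels wide.

```python
def binarize(gray, threshold):
    """Mark pixels darker than threshold as 1 (candidate eye pixels), else 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def horizontal_projection(mask):
    """Integral projection onto the vertical axis: dark-pixel count per row."""
    return [sum(row) for row in mask]


def vertical_projection(mask):
    """Integral projection onto the horizontal axis: dark-pixel count per column."""
    return [sum(column) for column in zip(*mask)]


def locate_eyes(gray, threshold=80, band=1):
    """Return ((row, col) of left eye, (row, col) of right eye).

    threshold and band are illustrative defaults, not values from the thesis.
    """
    mask = binarize(gray, threshold)
    # Assumed geometric prior: eyes lie in the upper half of the face region.
    upper = mask[: len(mask) // 2]

    rows = horizontal_projection(upper)
    eye_row = max(range(len(rows)), key=lambda r: rows[r])

    # Project only a thin band around the eye row onto the horizontal axis.
    top = max(0, eye_row - band)
    bottom = min(len(upper), eye_row + band + 1)
    cols = vertical_projection(upper[top:bottom])

    # Assumed geometric prior: one eye on each side of the vertical midline.
    mid = len(cols) // 2
    left_col = max(range(mid), key=lambda c: cols[c])
    right_col = max(range(mid, len(cols)), key=lambda c: cols[c])
    return (eye_row, left_col), (eye_row, right_col)
```

Restricting the projection to the upper half keeps a dark mouth or chin shadow from being mistaken for the eye row, which is the kind of geometric constraint the abstract alludes to.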
