面向汽车辅助驾驶的远红外行人检测关键技术研究

Research on Important Issues of Far-infrared Pedestrian Detection for Automotive Driver Assistance Systems

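【Author】 Zhuang Jiajun (庄家俊)

【Supervisor】 Liu Qiong (刘琼)

【Author information】 South China University of Technology, Computer Application Technology, 2013, PhD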

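【Abstract (translated from the Chinese)】 Pedestrian detection based on far-infrared (FIR) imaging has become an active research topic in computer vision and pattern recognition. FIR images do not depend on scene illumination; they reflect the surface temperature distribution of scene objects with different thermal emissivities, and can therefore capture pedestrians in darkness and smoke. Night-time pedestrian detection based on FIR imaging thus has important application prospects in automotive driver assistance systems and intelligent surveillance systems. Because pedestrians are non-rigid and move rather arbitrarily, their appearance is complex, variable and differs in scale, which leads to high intra-class diversity. Compared with visible-light images, pedestrians in FIR images also carry less texture information and have lower resolution. FIR-based pedestrian detection is therefore a highly challenging research problem.

This dissertation studies night-time pedestrian detection for automotive driver assistance on a vehicle platform equipped with a monocular camera. It addresses pedestrian detection that is real-time, accurate and suited to changing scenes, and covers key techniques for extracting regions of interest (ROIs), extracting descriptive features of FIR pedestrians, and recognizing pedestrians. The main contributions are as follows:

1) An FIR pedestrian detection method based on probabilistic template matching is proposed. Multi-scale probabilistic templates are built according to the moving direction of pedestrians, which alleviates the large intra-class variance caused by appearance. Object tracking and multi-frame validation are further integrated into the template matching process; the continuity of a pedestrian's presence across frames and the consistency of the detection results are used together to filter out unstable false detections and to recover some pedestrians missed because of imprecise ROI extraction. Experiments show that the method maintains real-time performance well and that, compared with building probabilistic templates from pedestrian gait patterns, it generalizes pedestrian appearance patterns better.

2) Within a statistical-learning recognition framework, the entropy weighted histograms of oriented gradients (EWHOG) feature is proposed to describe FIR pedestrians. It combines the local shape information of the object with the randomness of the local gradient distribution, so that local shape is better represented by dense local pixel gradients or edge orientations. To address the large intra-class variance caused by factors such as differing imaging scales, a pedestrian recognition method using a three-branch support vector machine (SVM) on EWHOG features is proposed, and the resulting support vectors are optimized with a fast classification support vector machine (FCSVM) to reduce the computation and storage needed for recognition. Based on the difference in gray-level distribution between an FIR pedestrian's head and the surrounding background, a head validation method is proposed to further suppress false detections. Experiments show that EWHOG distinguishes FIR pedestrians effectively, and that the fast classification scheme keeps the system real-time at the cost of a slight drop in recognition accuracy, achieving good detection performance in both urban and suburban scenes.

3) Because pedestrian detection is essentially a "rare event detection" problem, a vertical projection method based on pixel gradients is proposed for ROI extraction. Background regions such as sky and road surface in FIR images usually show highly homogeneous gray levels over large areas, so image gradient information is used to coarsely locate vertical image stripes that may contain pedestrians, which avoids searching the whole input image. Experiments show that the method improves search efficiency in the ROI extraction stage and suppresses some candidate regions that contain only background objects. In the recognition stage, a spatial pyramid image representation is integrated into EWHOG feature extraction: under multi-level cell partitions, the entropy-weighted distribution of local oriented gradient histograms and its global structure are used to represent FIR pedestrians, giving the pyramid entropy weighted histograms of oriented gradients (PEWHOG) feature. Since PEWHOG is a histogram statistic, an SVM classifier with the histogram intersection kernel (HIK) is used for recognition. Because representative training data are hard to collect and the classifier's predictive performance depends on the initial training data, an off-line training scheme based on bootstrapping and early stopping is proposed.

4) Training and test data usually differ unavoidably in distribution, so most pedestrian detection methods based on traditional statistical learning may perform poorly when scene factors change substantially. To address this, a Boosting-style inductive transfer learning algorithm, DTLBoost, is proposed to handle FIR pedestrian detection in changing scenes efficiently and effectively. The degree of prediction disagreement among member classifiers on the training data is defined explicitly and incorporated into DTLBoost's sample weight update rule, which selects auxiliary training data with positive transferability and encourages different member classifiers to learn different parts or aspects of the target training data. Finally, extensive experiments on both pedestrian recognition and pedestrian detection, using the dataset collected for this dissertation and the OSU far-infrared pedestrian dataset, show that the method achieves good detection performance in new scenes and under changed viewpoints.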
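The gradient-based vertical projection described in contribution 3 can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `vertical_stripes`, the parameters `rel_threshold` and `min_width`, and the rule of thresholding the column profile against a multiple of its mean are all assumptions made for illustration. The idea shown is only that homogeneous sky and road regions contribute little gradient energy, so columns with a high summed gradient magnitude are candidates for containing pedestrians.

```python
import numpy as np

def vertical_stripes(image, rel_threshold=1.5, min_width=2):
    """Return (start, end) column ranges with high summed gradient magnitude.

    Hypothetical helper: the thresholding rule is an assumption,
    not the rule used in the dissertation.
    """
    img = np.asarray(image, dtype=np.float64)
    gy, gx = np.gradient(img)            # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)         # per-pixel gradient magnitude
    profile = magnitude.sum(axis=0)      # project onto the horizontal axis
    active = profile > rel_threshold * profile.mean()

    stripes = []
    start = None
    for col, flag in enumerate(active):
        if flag and start is None:
            start = col                  # a new stripe begins
        elif not flag and start is not None:
            if col - start >= min_width:
                stripes.append((start, col))
            start = None
    # Close a stripe that runs to the right image border.
    if start is not None and len(active) - start >= min_width:
        stripes.append((start, len(active)))
    return stripes

if __name__ == "__main__":
    frame = np.zeros((20, 30))
    frame[5:16, 10:15] = 100.0           # a warm block on a cold background
    print(vertical_stripes(frame))
```

Only the returned column ranges would then be passed to a finer ROI search, which is how such a pre-segmentation step avoids scanning the whole frame.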

【Abstract】 Pedestrian detection based on far-infrared (FIR) imagery has become a hot spot in the computer vision and pattern recognition community. FIR imagery captures targets with different surface temperature distributions and thermal emissivities and does not depend on illumination conditions, which makes it suitable for capturing pedestrians in darkness and in smoke-filled scenes. It therefore has important potential in automotive driver assistance systems and transportation video surveillance at night. The wide variety of possible appearances and scales of pedestrians, caused by their non-rigid nature and highly arbitrary motion, usually leads to high within-class variance. Compared with visible-spectrum imagery, pedestrians in FIR imagery also appear as blurred targets with lower resolution and less texture information. Pedestrian detection based on FIR imagery is therefore a challenging task.

This dissertation focuses on night-time pedestrian detection for automotive driver assistance systems using a monocular FIR camera, aiming at (1) guaranteeing reliable performance for automotive applications, with both real-time operation and high detection accuracy; and (2) handling pedestrian detection across unseen scenarios and new viewpoints. The main contents cover the extraction of regions of interest (ROIs), feature representation for FIR pedestrians, and the framework of pedestrian recognition, and can be summarized as follows:

1) A night-time pedestrian detection method based on probabilistic template matching is proposed, in which multi-scale probabilistic templates are established according to the moving directions of pedestrians and employed to recognize potential pedestrians.
The probabilistic templates alleviate the large within-class variability of pedestrians caused by changing appearance and thus describe pedestrian appearance more accurately. Because detections of a pedestrian agree across several successive frames, an object tracking and multi-frame validation module is integrated into template matching to suppress some false detections and fill the detection gaps caused by inaccurately extracted ROIs. The experimental results demonstrate that the proposed method meets real-time criteria and that the resulting probabilistic templates describe pedestrians’ appearance more accurately than templates based on pedestrian gait patterns.

2) Following a learning-based detection framework, we first propose entropy weighted histograms of oriented gradients (EWHOG) to describe FIR pedestrians effectively. Considering both the local object shape and the degree of disorder of the local oriented gradient distribution, EWHOG places more emphasis on the distribution of local intensity gradients produced by local object shape. To reduce the within-class variance of objects located at different distances, a three-branch classifier combining EWHOG features and support vector machines (SVM) is presented to recognize pedestrians. To reduce the computational and storage overhead, the resulting support vectors are optimized using a fast classification support vector machine (FCSVM). A further validation phase is then proposed to suppress some false detections according to the intensity difference between FIR pedestrians’ heads and their adjacent regions.
Experiments show that the proposed EWHOG is more appropriate for distinguishing FIR pedestrians; the fast pedestrian recognition framework guarantees higher efficiency, and the results in both urban and suburban scenarios demonstrate acceptable detection performance at the cost of only a slight decrease in detection accuracy.

3) Considering the rare-event-detection nature of pedestrian detection, in which rare pedestrians must be located among enormous background regions in the image sequences, this dissertation proposes a pre-segmentation method called pixel-gradient oriented vertical projection to efficiently locate the vertical image stripes that probably contain FIR pedestrians, which avoids a dense search over the whole input image. It relies on the observation that the ground and sky in FIR images usually appear as large homogeneous regions, which makes it possible to perform the vertical projection using the gradient information. Experimental results indicate that the pre-segmentation method significantly improves the speed of ROI extraction and helps to filter out some negative ROIs. In order to capture both the local object shape described by the entropy weighted distribution of oriented gradient histograms and its pyramid spatial layout, a novel pyramid entropy weighted histograms of oriented gradients (PEWHOG) feature is proposed to describe FIR pedestrians. PEWHOG is then fed to a three-branch structured SVM classifier using the histogram intersection kernel (HIK).
An off-line training procedure combining bootstrapping and an early-stopping strategy is proposed to generate a more robust classifier by exploiting hard negative samples iteratively, which also addresses the dependence of the resulting classifier's generalization ability on the initial training data.

4) Under a traditional learning-based pedestrian detection framework, an FIR pedestrian classifier trained on data from one scenario may have difficulty detecting pedestrians correctly in a distinct scenario, owing to the inevitable disparity in distribution between the training data and the test data. It is also expensive and sometimes difficult to label sufficient new training data from target domains to re-train a scenario-specific classifier. To this end, this dissertation proposes a novel Boosting-style algorithm for data-level transfer learning, termed DTLBoost, to detect FIR pedestrians efficiently and effectively when adapting to distinct scenarios; it requires only a small amount of newly labeled training data from the target domains. To achieve better Boosting-style ensembles for inductive transfer learning, the degree of classification disagreement is formulated explicitly and incorporated into the weight updating rules of training samples. This helps to select the samples in the auxiliary data with positive transferability and encourages different base learners to learn different parts or aspects of the target data. Extensive experiments, including performance evaluation at both classifier level and system level, have been conducted to validate the effectiveness of the proposed method using our FIR pedestrian dataset and the OSU thermal pedestrian dataset. The results demonstrate that the proposed method markedly improves detection performance across distinct scenarios, i.e. when adapting to both new scenes and new viewpoints.
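The histogram intersection kernel named in contribution 3 has a standard closed form: for two histograms x and y, K(x, y) is the sum over bins of min(x_i, y_i). The sketch below computes the kernel matrix between two sets of histogram features with NumPy. The function name `histogram_intersection_kernel` and the toy histograms are illustrative assumptions; the abstract does not describe how the dissertation implements or accelerates the kernel.

```python
import numpy as np

def histogram_intersection_kernel(A, B):
    """Kernel matrix K[i, j] = sum_k min(A[i, k], B[j, k]).

    A has shape (n_a, d) and B has shape (n_b, d); rows are histograms.
    """
    A = np.asarray(A, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)
    # Broadcast to shape (n_a, n_b, d), take the bin-wise minimum, sum over bins.
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

if __name__ == "__main__":
    # Two toy 3-bin histograms (hypothetical values, each summing to 1).
    feats = np.array([[0.5, 0.5, 0.0],
                      [0.2, 0.3, 0.5]])
    print(histogram_intersection_kernel(feats, feats))
```

A matrix computed this way could be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. The broadcast form uses memory proportional to n_a × n_b × d, so large feature sets would need to be processed in chunks.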

  • 【Classification number】 TP391.41; TP274
  • 【Citations】 1
  • 【Downloads】 616