图像局部不变特征提取与匹配及应用研究

Study on the Detecting and Matching Technique of Local Invariant Feature and Its Applications

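【Author】 Zhang Jieyu (张洁玉)

【Advisor】 Xia Deshen (夏德深)

【Author information】 Nanjing University of Science and Technology, Pattern Recognition and Intelligent Systems, 2010, PhD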

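【Abstract (Chinese original, translated)】 Image feature extraction is an important research topic in image analysis, pattern recognition, and computer vision, and underlies many problems in these fields. Because images of a target usually differ by rotation, viewpoint, scale, illumination, blur, and other transformations, extracting stable image features has become a research focus. In recent years, a class of local invariant features, which are invariant to image translation, rotation, scale, illumination, and viewpoint changes, has been widely applied in image registration, image stitching, object recognition, target tracking, digital watermarking, and image retrieval. Methods based on local invariant features consist mainly of feature extraction (feature detection and feature description) and feature matching. This thesis analyzes the theoretical foundations of invariant features in depth, studies several existing methods for extracting and matching local invariant features, improves on them to address their shortcomings, and applies the improved methods to image registration, object recognition, and related tasks, with good results.

Multi-scale feature point detection methods based on scale-space theory are studied and their problems analyzed, and a new multi-scale feature point detection method based on Harris corners is proposed. The new method first detects Harris feature points at each scale of the scale space, traverses all scales to track every Harris point, and at the same time groups the points so that each group represents a single local structure. It then filters the points in each group, selecting the point at which the corner measure and the scale-normalized Laplace function both attain an extremum to represent that local structure. Finally, the feature points are described and matched with the SIFT descriptor. Experiments show that, for images with scale, viewpoint, JPEG compression, and blur changes, the feature points detected by the new method have higher repeatability than those of the original Harris-Laplace method. In registering remote sensing images from two spectral bands, the new method also achieves higher registration accuracy than the original Harris-Laplace method.

The Scale Invariant Feature Transform (SIFT) is a widely used feature point descriptor. It uses only the gradient information in the local neighborhood of a feature point, so when an image contains several similar local structures, points scattered among those structures are easily mismatched. To address this, the thesis proposes a SIFT mismatch correction method based on a spatial distribution descriptor. The method first extracts and matches feature points with SIFT. Each feature point in the matching result is then described again using the spatial distribution of image edge pixels relative to that point, yielding a more distinctive spatial distribution descriptor. Finally, this descriptor is used to correct two kinds of mismatch in the matching result. Experiments show that, compared with random sample consensus (RANSAC), the spatial distribution descriptor removes more mismatches while retaining more of the originally correct matches, and therefore has practical value.

The thesis also analyzes an affine invariant feature extraction method, multi-scale auto-convolution (MSA), and on that basis proposes an affine invariant feature extraction method called multi-scale auto-convolution entropy (MSAE). First, the MSAE feature is constructed from the MSA feature and proved to be affine invariant. Generalized canonical correlation analysis (GCCA) is then used to fuse MSA and MSAE into a new combined affine invariant feature that carries more image information. The combined feature and the MSA feature are each used as descriptors for the whole image and for the maximally stable extremal regions (MSER) detected in it, and classification and recognition experiments on both show that the new combined descriptor is more distinctive than the MSA descriptor.

Estimating the epipolar geometry constraint is the mainstream approach to removing mismatches. Among such methods, M-Estimators are relatively fast and stable under Gaussian noise, and are therefore promising. However, they depend entirely on an initial matrix estimated by linear least squares, which has low accuracy and poor stability. The thesis therefore proposes an improved M-Estimators algorithm. It first computes an initial value of the fundamental matrix with the seven-point method, then uses the sum of squared epipolar distances between the matched points and their corresponding epipolar lines as the measure to obtain a more accurate initial matrix than the original M-Estimators algorithm. This initial value is used to remove mismatched points and bad points from the original set of matches, and finally the Torr M-Estimators method is applied to the remaining matches for nonlinear optimization, giving the final matched point pairs. Experiments show that, compared with M-Estimators and Torr-M-Estimators, the improved method both improves the estimation accuracy of the fundamental matrix and is robust in the presence of mismatches and Gaussian noise.

Finally, after studying the principles of the Mean Shift algorithm and the related Camshift (Continuously Adaptive Mean Shift) algorithm and analyzing the shortcomings of Camshift, the thesis proposes a target tracking method that combines SIFT and Camshift. The target region is first converted to HSV color space, and SIFT is used to extract feature points from the H, S, and V channels of the target region separately. When the background texture within the target region is not complex, most SIFT feature points fall on the target, and these points are used to compute the target's hue histogram. The color probability distribution of the target in the next frame is then computed from this histogram. Next, SIFT feature points are extracted from the H, S, and V channels of the search region and matched against those of the target region. Finally, the pixels in the image patch enclosed by the matched feature points in the search region are used to compute the center of the next search window and the window's scale change factor. Tracking experiments on three image sequences covering indoor and outdoor environments demonstrate the effectiveness of the new method.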
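The epipolar-distance measure used to refine the initial fundamental matrix can be illustrated with a short sketch. It is an assumption for illustration, not the thesis's implementation: the abstract does not say whether the distance is one-sided or symmetric, so the symmetric form is used here (each point is measured against the epipolar line induced by its partner, in both images). The function names are hypothetical, and the convention assumed is x2^T F x1 = 0.

```python
def epipolar_line(F, x):
    """Return the line l = F x for a 3x3 matrix F and homogeneous point x."""
    return [sum(F[i][j] * x[j] for j in range(3)) for i in range(3)]


def point_line_distance_sq(line, pt):
    """Squared distance from the 2D point pt to the line a*x + b*y + c = 0.

    Assumes a and b are not both zero (a degenerate line is not handled).
    """
    a, b, c = line
    x, y = pt
    return (a * x + b * y + c) ** 2 / (a * a + b * b)


def transpose(F):
    return [[F[j][i] for j in range(3)] for i in range(3)]


def symmetric_epipolar_cost(F, pts1, pts2):
    """Sum of squared point-to-epipolar-line distances over all matches.

    pts1[i] in image 1 is matched to pts2[i] in image 2.
    """
    Ft = transpose(F)
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts1, pts2):
        line2 = epipolar_line(F, (x1, y1, 1.0))   # line in image 2 induced by x1
        line1 = epipolar_line(Ft, (x2, y2, 1.0))  # line in image 1 induced by x2
        total += point_line_distance_sq(line2, (x2, y2))
        total += point_line_distance_sq(line1, (x1, y1))
    return total


# Fundamental matrix of a purely horizontal camera translation:
# the epipolar lines are the image rows, so correct matches share a row.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
good = symmetric_epipolar_cost(F, [(1.0, 2.0), (3.0, 4.0)], [(5.0, 2.0), (0.0, 4.0)])
bad = symmetric_epipolar_cost(F, [(1.0, 2.0)], [(5.0, 3.0)])
```

Here `good` is zero because both matches lie on the same row in the two images, while `bad` is positive because the match is off its epipolar line by one pixel in each image. A candidate matrix with a smaller cost over the match set would be preferred as the initial value.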

【Abstract】 Image feature extraction, which underlies many areas of image processing, plays an important role in image analysis, pattern recognition, and computer vision. Because images often undergo transformations such as rotation, viewpoint change, scale change, illumination change, and blur, detecting stable features has become a focus of research in related fields. In recent years, local features that are invariant to a class of image transformations have proved successful in a wide range of applications, such as image registration, image stitching, object recognition, target tracking, watermarking, and image retrieval. Methods based on local invariant features consist mainly of feature extraction (feature detection and description) and feature matching. In this thesis, the theory of invariant features is analyzed thoroughly and several existing methods for detecting and matching local invariant features are studied. Several novel algorithms that build on the original ones are then proposed, with better performance in image registration, object recognition, and target tracking.

Multi-scale feature extraction methods based on scale-space theory are studied and their shortcomings analyzed. On the basis of this analysis, a feature detector named improved Harris-Laplace is proposed to obtain higher repeatability than the original Harris-Laplace. In this method, the Harris feature points at each scale are first extracted separately, and all points detected at each scale are tracked and grouped, beginning with the largest scale in the scale space, so that each group represents one local structure. Then, in each group, the point that simultaneously maximizes the corner measure and the scale-normalized Laplace function is selected. Finally, these points are described and matched with the SIFT descriptor.
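The corner measure evaluated at each scale can be sketched as follows. This is a minimal single-scale illustration of the standard Harris response R = det(M) - k * trace(M)^2, not the thesis's detector: it assumes central-difference gradients, a 3x3 box window in place of Gaussian weighting, and k = 0.04, and it leaves out the multi-scale tracking, grouping, and Laplace selection steps. The function name is hypothetical.

```python
def harris_response(img, x, y, k=0.04):
    """Harris corner measure R = det(M) - k * trace(M)^2 at pixel (x, y).

    img is a list of rows of floats. (x, y) must lie at least 2 pixels
    from the border so that every gradient in the 3x3 window is defined.
    """
    sxx = syy = sxy = 0.0
    for v in range(y - 1, y + 2):
        for u in range(x - 1, x + 2):
            # Central-difference gradients at (u, v).
            ix = (img[v][u + 1] - img[v][u - 1]) / 2.0
            iy = (img[v + 1][u] - img[v - 1][u]) / 2.0
            # Accumulate the entries of the structure tensor M.
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


# Synthetic 11x11 image: a bright square occupying x >= 5 and y >= 5.
img = [[1.0 if (x >= 5 and y >= 5) else 0.0 for x in range(11)]
       for y in range(11)]
corner = harris_response(img, 5, 5)  # at the corner of the square
edge = harris_response(img, 5, 8)    # on the vertical edge
flat = harris_response(img, 2, 2)    # in the uniform background
```

On this synthetic image the response is positive at the corner, negative on the straight edge, and zero in the flat region, which is the behavior that makes the measure usable for selecting corner points.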
For images with transformations such as scale change, viewpoint change, JPEG compression, and blur, experimental results indicate that the proposed method has higher repeatability than the original Harris-Laplace. Moreover, compared with the original Harris-Laplace, the improved method achieves more accurate registration of multi-sensor remote sensing images.

Scale Invariant Feature Transform (SIFT) is a widely used descriptor for local invariant features. However, because this descriptor uses only the gradient information in the neighborhood of a feature point, mismatches may appear when the extracted feature points lie in similar structures of one image. A novel method based on a spatial distribution descriptor is therefore proposed to correct the mismatches caused by SIFT. In the proposed method, the feature points are first detected and matched by SIFT, and each matched point is then described again, using the spatial distribution of the pixels on the image contour relative to the matched point, to generate a more distinctive descriptor. Finally, two kinds of mismatches are corrected with the new descriptor. The experimental results indicate that, compared with random sample consensus (RANSAC), the proposed algorithm excludes more false matches while retaining more of the original correct matches.

Meanwhile, a new local affine invariant feature descriptor is proposed. First, a new feature named Multi-scale Auto-convolution Entropy (MSAE) is constructed based on Multi-scale Auto-convolution (MSA) and proved to be affine invariant. Then MSA is combined with MSAE using Generalized Canonical Correlation Analysis (GCCA) to obtain a new feature carrying more information. This combined feature can be seen as a new local affine invariant feature descriptor. Finally, the whole image and the Maximally Stable Extremal Regions (MSER) extracted from the image are each described by the new descriptor.
Two recognition experiments verify that the proposed combined affine invariant feature is more distinctive than MSA.

Furthermore, algorithms based on the epipolar geometry constraint are now the mainstream approach to discarding mismatches. Among them, M-Estimators, with fast computation and robustness to Gaussian noise, has good application prospects. However, because this algorithm depends entirely on the initial matrix obtained by the method of least squares, its precision and stability are poor. An improved M-Estimators algorithm for estimating the fundamental matrix is therefore proposed. The improved method first calculates the initial matrix with the seven-point technique. Then the sum of squared distances between the matching points and the corresponding epipolar lines is used as a metric to calculate a more precise initial fundamental matrix than that of M-Estimators. Next, this initial matrix is used to eliminate the mismatches in the original point set. Finally, a nonlinear optimization of the new set of matched points is carried out with Torr-M-Estimators to obtain the final matched point pairs. Extensive experiments in the presence of mismatches and Gaussian noise indicate that, compared with M-Estimators and Torr-M-Estimators, the proposed algorithm not only improves the estimation precision but is also robust.

In the last part, two algorithms, Mean Shift and its derivative Camshift, are studied, and a new target tracking method combining SIFT and Camshift is proposed to overcome the shortcomings of Camshift. In the first step, the target region is transformed to HSV color space, and feature points are extracted from the H, S, and V channels separately by the SIFT algorithm.
The SIFT points, most of which lie on the target when the background texture in the target region is weak, are used to generate the hue histogram. Second, the color probability distribution of the next frame is obtained from the hue histogram. Third, the SIFT algorithm is used to detect points in the H, S, and V channels of the search region, and these points are matched with those in the target region. Finally, the pixels within the region enclosed by the matched points in the search region are used to calculate the new center and size of the search region. Three image sequences covering indoor and outdoor test environments are used to evaluate the proposed method. Experimental results indicate that the new method is effective.
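The window-center update at the heart of Mean Shift and Camshift can be sketched as below. This shows only the standard step of moving a search window to the centroid of a color probability map, computed from image moments. It is an illustration under assumptions, not the proposed tracker: the thesis restricts the computation to pixels enclosed by matched SIFT points, which is not reproduced here, and the function names and the fixed window size are hypothetical.

```python
import math


def window_moments(prob, x0, y0, w, h):
    """Zeroth and first moments of prob inside the window.

    The window has top-left corner (x0, y0) and size w x h, and is
    clipped to the bounds of the probability map.
    """
    m00 = m10 = m01 = 0.0
    for y in range(max(0, y0), min(len(prob), y0 + h)):
        for x in range(max(0, x0), min(len(prob[0]), x0 + w)):
            p = prob[y][x]
            m00 += p
            m10 += x * p
            m01 += y * p
    return m00, m10, m01


def mean_shift(prob, x0, y0, w, h, max_iter=20):
    """Shift the window until its center sits on the local centroid."""
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(prob, x0, y0, w, h)
        if m00 == 0.0:
            break  # no probability mass under the window
        cx, cy = m10 / m00, m01 / m00
        # Top-left corner that places the window center on the centroid.
        nx = int(math.floor(cx - (w - 1) / 2.0 + 0.5))
        ny = int(math.floor(cy - (h - 1) / 2.0 + 0.5))
        if (nx, ny) == (x0, y0):
            break  # converged
        x0, y0 = nx, ny
    return x0, y0


# 10x10 probability map with a 3x3 blob centered at (x=7, y=6).
prob = [[1.0 if (6 <= x <= 8 and 5 <= y <= 7) else 0.0 for x in range(10)]
        for y in range(10)]
new_x0, new_y0 = mean_shift(prob, 3, 3, 5, 5)
```

Starting from a 5x5 window at (3, 3) that only partly overlaps the blob, the window converges to top-left corner (5, 4), so its center lands on the blob center (7, 6). Camshift additionally rescales the window from the zeroth moment after each convergence.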

  • 【Classification number】TP391.41
  • 【Times cited】32
  • 【Downloads】4702