图像局部不变特征提取技术研究及其应用

Research on the Local Invariant Feature Extraction of Images and Its Application

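【Author】 Sun Jing (孙晶)

【Advisors】 Chu Jinkui (褚金奎); Xing Yingjie (邢英杰)

【Author information】 Dalian University of Technology, Precision Instruments and Machinery, 2009, PhD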

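【Abstract (translated from the Chinese)】 Extracting local invariant features from targets underpins many areas of computer vision, including image processing, digital watermarking, copy detection, and video retrieval. Because most targets differ by generalized affine transformations such as changes in viewpoint, scale, and rotation, as well as blur, partial occlusion, and complex backgrounds, making the extracted features stable, repeatable, and matchable has become a key research problem in vision.

To address the high time complexity, covered regions, and irregular region shapes of the original Maximally Stable Extremal Regions algorithm, the dissertation builds a detector that jointly uses an optimized neighborhood quadtree data structure, a maximal-stability criterion based on the component tree, and a vector-based general shape-adjustment formula built on second-order central moments. The detector extracts accelerated elliptical maximally stable extremal regions (Elliptical Maximally Stable Extremal Regions-Accelerative, EMSER-A). After the image pixels are sorted, extremal regions driven by the changing gray-level threshold are extracted with the neighborhood quadtree, optimized by union by rank and path compression, which effectively recovers the full information of regions that would otherwise end up as a single pixel value and gray-level threshold. The extremal regions serve as nodes of a component tree, from which the maximal-stability criterion is obtained. To ease subsequent feature description, the shape-adjustment formula is reduced to a two-dimensional covariance matrix, which reshapes irregular regions into ellipses. While preserving repeatability, the EMSER-A detector reduces the time complexity from O(n log log n) to O(Nα(N)). Feature extraction and matching experiments show that, under generalized affine transformations, the detector still extracts a large number of reasonably distinctive local features. Comparative repeatability experiments show that EMSER-A has fairly good invariance to viewpoint, scale, rotation, illumination, and blur.

To address the low localization accuracy of feature points, the difficulty of adjusting the affine shape of regions, and the weak affine invariance of features, the dissertation also builds an affine-invariant detector based on point regions. It proves the scale invariance of normalized LoG (Laplacian-of-Gaussian) image derivatives, which gives a theoretical basis for determining the characteristic scale. It gives qualitative and quantitative definitions of the characteristic scale and refines and extends its properties. It systematically proves the affine invariance of the shape-adaptation matrix in affine Gaussian scale space and the rotation relation between regions normalized by that matrix. It gives criteria for choosing the integration and differentiation scales during iteration and describes how the differentiation scale affects the detector's noise robustness. It establishes a spatial-location iteration matrix based on the affine shape-adaptation matrix and uses it to map from the normalized region back to the image domain. On this basis, an affine-invariant detector, Location/Scale/Shape-Iterative (LS²-I), is built by combining the normalized LoG equation, the multi-scale Harris measure, the affine shape-adaptation matrix, and the spatial-location iteration matrix; it simultaneously iterates the spatial location, characteristic scale, and affine neighborhood shape of point regions. Extraction and matching experiments show that, when feature regions are extracted from two images of the same scene related by a generalized affine transformation, the neighborhoods of the converged feature points appear consistent in the normalized image domain after the simultaneous iteration of location, scale, and shape. Feature points are thus precisely localized, region shapes are affinely adjusted, and good matching results are obtained. Comparative repeatability experiments show that LS²-I has fairly good repeatability under changes in viewpoint, scale, rotation, and illumination and in the presence of blur.

To verify the matchability and stability of EMSER-A and LS²-I, the dissertation builds corresponding image retrieval mechanisms. First, with EMSER-A and LS²-I as the low-level local feature regions, SIFT descriptors are generated and clustered into a vector-quantized visual vocabulary. Combining standard weighting with a similarity criterion for invariant features, a retrieval method based on a user-boxed target region is proposed, and the results receive a first ranking by similarity score. To exploit the invariance of the feature regions more effectively, the dissertation proposes a search-unit-based region matching method and a cluster-based spatial consistency measure. The former relies on the affine covariance of the elliptical EMSER-A and LS²-I regions: two already-matched elliptical regions, one in the target region and one in the image under test, are taken as a search unit; within that unit, regions are matched according to the relation between the original match and new matches; regions with zero score are deleted; and the images are finally re-ranked by score. The latter uses the relations among the spatial angles of correct matching pairs to give a cluster-based spatial consistency filter that effectively removes false matches. Four retrieval mechanisms are built. Retrieval experiments demonstrate the matchability of EMSER-A and LS²-I and the correctness of the retrieval mechanisms based on target regions and spatial consistency. In addition, comparison experiments with an enlarged image database and generalized retrieval experiments demonstrate the stability and correctness of the feature regions and retrieval mechanisms.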

【Abstract】 Extracting local invariant features from a target is the foundation of research in many areas of computer vision, such as image processing, digital watermarking, copy detection, and video retrieval. Because most targets are related by generalized affine transformations (viewpoint, scale, rotation, blur, partial occlusion, and complex background), giving local invariant features good stability, repeatability, and matchability is a central problem in the field. To overcome the high time complexity, overlapping regions, and irregular shapes of the original Maximally Stable Extremal Regions detector, the dissertation constructs a detector that uses in parallel an optimized neighborhood quadtree data structure, a maximal-stability criterion based on the component tree, and a vector-based general shape-modification expression built on second-order central moments; it then extracts Elliptical Maximally Stable Extremal Regions-Accelerative (EMSER-A). After pixel ordering, the neighborhood quadtree, optimized with path compression and union by rank, is used to extract extremal regions as the intensity threshold changes, which efficiently restores all the information of a region that would finally turn into a single pixel value and intensity threshold.
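The merging step described above can be illustrated with a minimal sketch. It assumes a plain disjoint-set forest over 4-connected pixels rather than the dissertation's neighborhood quadtree, whose details the abstract does not give; the names `DisjointSet` and `region_areas_per_threshold` are illustrative only. Pixels are activated in order of increasing intensity, and union by rank with path compression keeps each merge close to constant amortized cost, which is the usual origin of an inverse-Ackermann bound of the kind quoted in the abstract.

```python
class DisjointSet:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.size = [1] * n  # area of the region rooted at each index

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return ra


def region_areas_per_threshold(image):
    """Map each intensity threshold to the sorted areas of its extremal regions.

    `image` is a list of equal-length rows of integers (hypothetical input format).
    """
    h, w = len(image), len(image[0])
    flat = [v for row in image for v in row]
    order = sorted(range(h * w), key=flat.__getitem__)  # pixel ordering
    dsu = DisjointSet(h * w)
    active = [False] * (h * w)
    history = {}
    k = 0
    while k < len(order):
        level = flat[order[k]]
        # Activate every pixel at this intensity and merge with active neighbours.
        while k < len(order) and flat[order[k]] == level:
            idx = order[k]
            active[idx] = True
            r, c = divmod(idx, w)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and active[nr * w + nc]:
                    dsu.union(idx, nr * w + nc)
            k += 1
        # Collecting roots is O(N) per level; acceptable for a sketch only.
        roots = {dsu.find(i) for i in range(h * w) if active[i]}
        history[level] = sorted(dsu.size[root] for root in roots)
    return history


image = [
    [0, 0, 9],
    [0, 9, 1],
    [9, 1, 1],
]
print(region_areas_per_threshold(image))  # {0: [3], 1: [3, 3], 9: [9]}
```

A stability test in the spirit of MSER would then compare how slowly a region's area changes across neighbouring thresholds; that part is omitted here because the abstract does not state the exact criterion.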
Extremal regions are used as nodes to construct the component tree, and the maximal-stability criterion is obtained by moving through the tree. To simplify the description of feature regions, the dissertation establishes a vector-based general shape-modification expression using second-order central moments, reduces it to a 2D covariance matrix, and reshapes irregular regions into ellipses. While preserving repeatability, EMSER-A reduces the time complexity from O(n log log n) to O(Nα(N)). Extraction and matching experiments show that, even under generalized affine transformations, the detector extracts numerous distinctive local features. Comparative repeatability experiments show that EMSER-A has good invariance to viewpoint, scale, rotation, brightness, and blur.

To resolve the low localization precision of feature points, the difficulty of modifying the affine shape of a region, and the poor affine invariance of features, the dissertation also constructs a detector based on point regions. It demonstrates the scale invariance of normalized Laplacian-of-Gaussian image derivatives, which provides a theoretical foundation for characteristic scales. It presents qualitative and quantitative definitions of the characteristic scale and further develops its properties. It systematically proves the affine invariance of the shape-adapted matrix and the rotation relation between two regions normalized by that matrix. It establishes principles for determining the integration scale and the differentiation scale, and analyzes the impact of the differentiation scale on noise robustness. It builds a spatial-location iteration matrix based on the affine shape-adapted matrix and uses it to convert from normalized regions to the image domain. With this theoretical support, Location/Scale/Shape-Iterative (LS²-I) is constructed by combining the normalized Laplacian-of-Gaussian, the multi-scale Harris measure, the affine shape-adapted matrix, and the spatial-location iteration matrix; it synchronously iterates the spatial location, characteristic scale, and affine neighborhood shape of point regions. Extraction and matching experiments show that, when feature regions are extracted from two images of the same scene related by a generalized affine transformation, the neighborhoods of the converged feature points have the same content in the normalized image domain after synchronous iteration of location, scale, and shape, and satisfactory matching results are obtained. Comparative repeatability experiments demonstrate that LS²-I has good repeatability under changes in viewpoint, scale, rotation, brightness, and blur.

To verify the matchability and stability of EMSER-A and LS²-I, the dissertation constructs corresponding image retrieval mechanisms. First, SIFT descriptors are computed on the EMSER-A and LS²-I regions, which serve as low-level local feature regions, and are clustered into a vector-quantized visual vocabulary. Using standard weighting and a similarity rule for measuring invariant features, the query region is selected with a rectangle in the query image.
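The characteristic-scale idea behind the detector based on point regions can be checked on a toy case. This is a sketch under stated assumptions, not the dissertation's procedure: the input is an ideal unit-mass isotropic 2D Gaussian blob of variance `t0`, and the scale parameter `t` is a variance (σ²). Smoothing that blob to scale `t` gives a Gaussian of variance `t0 + t`, whose Laplacian at the centre is -1/(π(t0 + t)²); multiplying by `t` gives the scale-normalized response, whose magnitude peaks at `t = t0`. The function names are hypothetical.

```python
import math


def normalized_log_at_center(t, t0):
    """Scale-normalized Laplacian at the centre of a Gaussian blob.

    t  : scale-space parameter (variance of the smoothing kernel)
    t0 : variance of the blob itself (assumed ideal, unit mass)
    """
    return -t / (math.pi * (t0 + t) ** 2)


def characteristic_scale(t0, scales):
    """Pick the candidate scale with the strongest normalized response."""
    return max(scales, key=lambda t: abs(normalized_log_at_center(t, t0)))


scales = [0.5 * k for k in range(1, 41)]   # candidate scales 0.5 .. 20.0
print(characteristic_scale(4.0, scales))   # 4.0
print(characteristic_scale(16.0, scales))  # 16.0
```

The selected scale follows the blob size (4.0 and then 16.0), which is the behaviour that makes a normalized Laplacian-of-Gaussian useful for choosing a scale-covariant neighborhood.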
Finally, a similarity score is computed, by which the retrieved images are ranked a first time. To make full use of the invariance of the feature regions, the dissertation presents a region matching method based on search units and a cluster-based spatial consistency measure. The former works as follows: exploiting the affine covariance of the elliptical EMSER-A and LS²-I regions, two already-matched elliptical regions, one in the object region and one in the retrieved image, are taken as search units, within which regions are matched according to the relation between the original match and the new match, and regions with zero score are deleted. The latter is based on the relations among different correct matching pairs and yields a filter for removing false matches: cluster-based spatial consistency filtering. Lastly, the frequency-based ranking is reweighted to give the spatial-consistency re-ranking. The dissertation establishes four different retrieval mechanisms. Retrieval experiments confirm the matchability of EMSER-A and LS²-I and the validity of the retrieval mechanisms based on the object region and spatial consistency. In addition, a comparison experiment with an enlarged image database and a generalized retrieval experiment demonstrate the stability and correctness of the two feature regions and of the retrieval mechanisms, respectively.
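The first-pass ranking by similarity score can be sketched as follows. The abstract says only that "standard weighting" is applied to a vector-quantized visual vocabulary; this example assumes that means tf-idf weighting with cosine similarity, which is a common choice for visual-word retrieval but is not confirmed by the source. Images are represented as lists of visual-word ids, and `tfidf_vectors`, `cosine`, and `rank_images` are hypothetical names.

```python
import math
from collections import Counter


def tfidf_vectors(database):
    """database: dict mapping image name -> list of visual-word ids."""
    n = len(database)
    doc_freq = Counter()
    for words in database.values():
        doc_freq.update(set(words))
    idf = {w: math.log(n / df) for w, df in doc_freq.items()}
    vectors = {}
    for name, words in database.items():
        counts = Counter(words)
        vectors[name] = {w: (c / len(words)) * idf[w] for w, c in counts.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_images(query_words, database):
    """Rank database images by similarity to the words in the query region."""
    vectors, idf = tfidf_vectors(database)
    counts = Counter(query_words)
    # Words absent from the database vocabulary get zero weight.
    query = {w: (c / len(query_words)) * idf.get(w, 0.0) for w, c in counts.items()}
    return sorted(database, key=lambda name: cosine(query, vectors[name]), reverse=True)


database = {
    "a": [1, 1, 2, 3],
    "b": [4, 5, 6, 3],
    "c": [1, 2, 2, 7],
}
print(rank_images([1, 2], database))  # ['a', 'c', 'b']
```

A spatial-consistency stage such as the one the dissertation describes would re-rank this list using the geometry of the matched regions; it is left out here because the abstract does not specify the scoring rule.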

  • 【Classification code】 TP391.41
  • 【Times cited】 42
  • 【Downloads】 2618