A Study of Image Classification and Semantic Image Annotation

【Author】 张磊

【Supervisor】 马军

【Author Information】 山东大学 (Shandong University), Computer System Architecture, 2008, Master's thesis

【Abstract】 With the development of multimedia technology and the spread of the Internet, it has become increasingly easy to acquire multimedia information of all kinds, of which images are the most numerous. How to retrieve the desired images quickly and effectively from large-scale image databases has therefore become a problem of growing concern. Content-Based Image Retrieval (CBIR) represents the content of an image by its low-level visual features (color, texture, shape, etc.). Because of the "semantic gap" between low-level visual features and the semantics of an image, traditional CBIR techniques cannot satisfy users who want to retrieve images by meaning. Classifying or annotating an image collection by semantics in advance can greatly improve the performance of a CBIR system. This thesis studies semantic image classification and automatic semantic annotation based on low-level visual features. The main contributions are as follows:

1. A rotation-invariant texture classification algorithm based on the Gabor transform and the Support Vector Machine (SVM) is proposed. To ensure that the classifier knows nothing about the features of rotated images, the training set is drawn from sub-images of the top halves of unrotated images and the test set from sub-images of the bottom halves of rotated images, so the experiment is a genuinely rotation-invariant one. Experiments on the Brodatz and UIUCTex datasets show that the method is effective and feasible; classification accuracy reaches 100% for some classes, and both accuracy and time complexity are better than those of the kNN (k-Nearest Neighbors) algorithm.

2. An image classification algorithm that combines MPEG-7 visual descriptors with an SVM is proposed. Since the image collection contains several semantic categories, a multi-class SVM is built using a multi-class classification strategy. Image features are extracted with the MPEG-7 Experimentation Model software. Experiments on the Corel 1K dataset use several color and texture descriptors and compare the classification accuracy and time complexity of each descriptor combined with the SVM classifier. The results also show that properly fusing several visual descriptors yields higher classification accuracy.

3. An automatic semantic image annotation algorithm based on SVM classifiers is proposed. The image features are global features based on MPEG-7 color and texture descriptors. Each annotation word corresponds to a binary SVM classifier, and a multi-class classifier over all the words is built with a multi-class classification strategy, establishing the link between low-level image features and semantic words. The SVM output takes the form of posterior probabilities, so the likelihood that an image belongs to each word class can be compared directly. Experiments are carried out on the Corel 5000 dataset: all keywords are first stemmed with the Porter stemming algorithm, words attached to too few images are discarded, and 82 words remain for building the classifiers. Two strategies for selecting annotation words are tried and their results compared. Annotation quality is evaluated with both per-word and per-image precision and recall, which makes the evaluation more objective and comprehensive.
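
A rough sketch of the Gabor + SVM texture-classification pipeline from contribution 1 is given below. It is a minimal illustration, not the thesis implementation: scikit-image's gabor filter bank and scikit-learn's SVC stand in for the author's Gabor transform and SVM, the pooling of filter-response statistics over orientations is only a simple stand-in for the rotation-invariant features actually used, and the variables train_images / test_images are hypothetical placeholders for the Brodatz / UIUCTex splits (top halves of unrotated images for training, bottom halves of rotated images for testing).

import numpy as np
from skimage.filters import gabor
from sklearn.svm import SVC

def gabor_features(image, frequencies=(0.1, 0.2, 0.3, 0.4), n_orientations=8):
    """Per-frequency statistics of Gabor magnitude responses, pooled over
    orientations (a crude stand-in for the rotation-invariant Gabor features
    used in the thesis)."""
    feats = []
    for f in frequencies:
        energies = []
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            real, imag = gabor(image, frequency=f, theta=theta)
            energies.append(np.hypot(real, imag).mean())
        energies = np.asarray(energies)
        # pooling across orientations makes these per-frequency statistics
        # approximately insensitive to texture rotation
        feats.extend([energies.mean(), energies.std(), energies.max()])
    return np.asarray(feats)

def classify(train_images, train_labels, test_images):
    """Train an RBF-kernel SVM on Gabor features and predict test labels."""
    X_train = np.vstack([gabor_features(im) for im in train_images])
    X_test = np.vstack([gabor_features(im) for im in test_images])
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    clf.fit(X_train, train_labels)
    return clf.predict(X_test)

The per-word annotation scheme of contribution 3 can be sketched in the same spirit: one binary SVM per keyword, trained one-vs-rest, with Platt-scaled probability outputs playing the role of the posterior-probability form mentioned above. The feature matrix is assumed to hold pre-extracted global color/texture descriptors (the thesis extracts them with the MPEG-7 Experimentation Model software); MIN_IMAGES, TOP_K and the helper names are illustrative assumptions rather than the thesis settings. For contribution 2, fusing several descriptors would simply amount to concatenating their vectors into such a feature matrix before training a multi-class SVM.

import numpy as np
from collections import Counter
from nltk.stem import PorterStemmer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import SVC

MIN_IMAGES = 20   # discard words attached to too few training images (assumed threshold)
TOP_K = 5         # number of annotation words kept per image (assumed)

def prepare_vocabulary(keyword_lists):
    """Porter-stem every keyword and keep only sufficiently frequent stems."""
    stemmer = PorterStemmer()
    stemmed = [[stemmer.stem(w) for w in words] for words in keyword_lists]
    counts = Counter(w for words in stemmed for w in set(words))
    vocab = sorted(w for w, c in counts.items() if c >= MIN_IMAGES)
    return stemmed, vocab

def train_annotator(features, stemmed_keywords, vocab):
    """One Platt-calibrated binary SVM per word (one-vs-rest), so the
    classifier outputs a posterior probability estimate for every word."""
    labels = MultiLabelBinarizer(classes=vocab).fit_transform(stemmed_keywords)
    clf = OneVsRestClassifier(SVC(kernel="rbf", probability=True))
    clf.fit(features, labels)
    return clf

def annotate(clf, vocab, features):
    """Return the TOP_K most probable annotation words for each image."""
    proba = clf.predict_proba(features)          # per-word posterior estimates
    top = np.argsort(-proba, axis=1)[:, :TOP_K]
    return [[vocab[j] for j in row] for row in top]

Annotation quality would then be scored with mean per-word and mean per-image precision and recall over the predicted word sets, as described in the abstract.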

  • 【Online Publication Contributor】 山东大学
  • 【Online Publication Year/Issue】 2009, Issue 01