Research on Image Classification Algorithms Based on Transductive Multi-instance Learning

(Original Chinese title: 基于直推式多示例学习的图像分类算法研究)

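【Author】 Wang Qi (汪旗)

【Advisors】 Jia Zhaohong (贾兆红); Li Longshu (李龙澍)

【Author information】 Anhui University, Computer Application Technology, 2013, Master's thesis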

【Abstract】 With the rapid development of multimedia and Internet technologies and the recent popularization of digital products, the number of digital images has grown explosively. How to manage such massive image collections effectively and apply them across fields has become a new research hotspot, and image classification is one of the important problems urgently awaiting a solution. Traditional image classification methods usually rely on manual labeling, which has two intractable problems. First, it is limited by human factors: manually labeled images often carry strong subjectivity. Second, manual labeling is a huge, time-consuming, and laborious task that is hard to carry out at scale. Content-based image classification (CBIC), which emerged in the 1990s, extracts low-level image features and then applies a series of processing and learning steps to classify images. Although CBIC has produced some research results, existing methods usually work with a single image feature; since an image typically contains more than one kind of content, a single feature cannot describe it adequately. Multi-instance learning (MIL), by its nature, handles this difficulty well. Building on an in-depth study of MIL and support vector machines (SVM), this thesis proposes two new MIL methods for image classification. The main contributions are as follows:

1. An MIL algorithm based on the transductive support vector machine (TSVM), called DD-TSVM, is proposed. The diverse density (DD) algorithm is used to find local extremum points in the instance space. These points are used to construct a feature space, the bags are nonlinearly mapped into that space, and a TSVM is then used to train the classifier. The algorithm makes effective use of unlabeled samples, and experimental results on the Corel image database show that DD-TSVM performs well.

2. To address redundant data in MIL training sets, a new MIL algorithm, DDRS-TSVM, is proposed. Building on DD-TSVM, it introduces a neighborhood-based rough set technique to process the MIL training data and eliminate the influence of redundant data on classification. Experimental results on the Corel image set show that DDRS-TSVM improves on DD-TSVM.
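
The first two stages of the DD-TSVM pipeline described in the abstract (finding diverse-density extrema, then mapping bags into the feature space they define) can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: the standard noisy-or form of diverse density is used, candidate points are restricted to instances of positive bags instead of being refined by gradient ascent, and each bag is embedded by its minimum Euclidean distance to every prototype. The names `diverse_density`, `embed_bag`, and the toy bags are hypothetical. The final TSVM training step is omitted, since it depends on whichever transductive SVM implementation is available.

```python
import numpy as np

def diverse_density(t, positive_bags, negative_bags, scale=1.0):
    """Noisy-or diverse density of a candidate point t (assumed form)."""
    def match_prob(bag):
        # Probability that each instance in the bag "matches" t.
        sq_dist = np.sum((bag - t) ** 2, axis=1)
        return np.exp(-scale * sq_dist)

    dd = 1.0
    for bag in positive_bags:
        # A positive bag should have at least one instance near t.
        dd *= 1.0 - np.prod(1.0 - match_prob(bag))
    for bag in negative_bags:
        # A negative bag should have no instance near t.
        dd *= np.prod(1.0 - match_prob(bag))
    return dd

def embed_bag(bag, prototypes):
    """Map a bag to a vector of minimum distances to each prototype."""
    diff = bag[:, None, :] - prototypes[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))  # shape: (instances, prototypes)
    return dist.min(axis=0)

# Toy data (hypothetical): both positive bags contain an instance near (1, 1).
positive_bags = [
    np.array([[1.0, 1.0], [5.0, 5.0]]),
    np.array([[1.1, 0.9], [-3.0, 2.0]]),
]
negative_bags = [np.array([[5.0, 5.1], [-3.0, 2.1]])]

# Simplification: score only the positive instances as candidate points.
candidates = np.vstack(positive_bags)
scores = np.array(
    [diverse_density(c, positive_bags, negative_bags) for c in candidates]
)
best = candidates[np.argmax(scores)]

# Use the single best candidate as the prototype set for this toy example.
prototypes = best[None, :]
features = np.array(
    [embed_bag(b, prototypes) for b in positive_bags + negative_bags]
)
```

In this toy setting the positive bags end up close to the prototype and the negative bag far from it, which is the separation a downstream classifier would exploit. In a transductive setting, unlabeled bags would be embedded with the same `embed_bag` mapping before training.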

  • 【Online publication contributor】 Anhui University
  • 【Online publication issue】 2013, Issue 11