Multi-Label Classification with Its Application into Multimedia Analysis (多类别模式分类技术及其在多媒体分析上的应用)

【Author】 Guo-Jun Qi (齐国君)

【Advisor】 Hong-Jiang Zhang (张宏江)

【Author Information】 University of Science and Technology of China, Pattern Recognition and Intelligent Systems, 2009, PhD

【Abstract】 Automatic concept annotation for multimedia is a key technology for semantic-level video browsing, search, and navigation. Research on this topic has evolved through two paradigms. The first paradigm uses binary classification to detect each concept in a concept set individually. It achieved only limited success, because it ignores the inherent correlations between concepts, e.g., urban and building. The second paradigm adds a semantic fusion step on top of the individual concept detectors to exploit the correlations among concepts. However, its performance varies, because errors made in the first detection step can propagate to the second fusion step and degrade overall performance. To address these issues, we first propose a third paradigm that simultaneously models the relationship between individual concepts and low-level features and the correlations among concepts in a single step, using a novel Correlative Multi-Label (CML) framework. We compare the proposed approach with state-of-the-art approaches from the first and second paradigms on the widely used TRECVID data set and report superior performance. On the other hand, conventional active learning dynamically constructs the training set only along the sample dimension. While this is the right strategy for binary classification, it is sub-optimal for multi-label image classification. We argue that, for each selected sample, only some effective labels need to be annotated, while the others can be inferred from the label correlations. The reason is that, because of these correlations, different labels contribute differently to minimizing the classification error. To this end, we propose to select sample-label pairs, rather than only samples, so as to minimize a multi-label Bayesian classification error bound. We call this two-dimensional active learning, because the selection strategy considers both the sample dimension and the label dimension. Furthermore, because the number of training samples grows rapidly over time under active learning, retraining an offline learner on the whole training set becomes intractable. We therefore develop an efficient online learner that updates the existing model using only the newly arrived data, by minimizing the distance between the updated and existing models under a set of multi-label constraints. The effectiveness and efficiency of the proposed method are evaluated on two benchmark data sets and on a realistic image collection from a real-world image sharing website, Corbis.
