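Research on Semantic Analysis and Automatic Annotation of Web Images

【Author】 Xu Hongtao (许红涛)

【Advisor】 Shi Bole (施伯乐)

【Author information】 Fudan University, Computer Software and Theory, 2009, PhD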

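【Summary】 Web images are usually associated with several types of information, such as the visual features of the image itself (color, texture, shape, etc.) and associated textual information, and their semantic content is related to this information to a greater or lesser degree. A huge "semantic gap" exists between the visual feature space and the semantic concept space of images, so automatic image semantic annotation based on visual content performs far below expectations. The text associated with a Web image is closer to the image's semantic space, so using that text to reveal semantic content is an important means of automatic Web image semantic annotation. However, the distribution of a Web image's semantic content over its associated text is complex and variable, and different images or semantic keywords usually correspond to different semantic distributions. Most existing methods either treat all associated text as a whole or estimate a fixed semantic distribution model in advance from prior knowledge or heuristics alone, so annotation performance still needs further improvement. Focusing on the complexity and image-specific variability of this distribution, this thesis applies the idea of adaptive learning to automatic Web image semantic annotation, makes new attempts in several respects, and proposes several annotation methods with good performance. It also applies automatic Web image semantic annotation to Web multimedia information search, as a preliminary attempt at a search mode that combines text and images. The main research content is as follows:

1. An automatic Web image semantic annotation method based on adaptive learning of the position weights of associated text. Basis expansion is used to take into account the contribution of high-order structural relations among the associated texts to predicting a Web image's semantic content, and a novel piecewise penalized weighted regression model is proposed to adaptively model the distribution of the image's semantic content over its associated text. Experiments show that the proposed method greatly improves annotation performance.

2. An automatic Web image semantic annotation method based on an adaptive model. Building on the position-weight method above, it further considers the contribution of the image's visual features and of prior knowledge to predicting semantic content, and proposes a constrained piecewise penalized weighted regression model to adaptively model the distribution of the image's semantic content over its associated text. Experiments show that the proposed method greatly improves annotation performance.

3. An automatic Web image semantic annotation method based on a conditional random field model. The conditional random field model integrates the different types of information related to a Web image so that each contributes to predicting the image's semantic content. In particular, Flickr tag resources are used to learn the semantic co-occurrence between annotation keywords. Experiments show that the proposed method, together with the Flickr-tag-based semantic co-occurrence matrix of annotation keywords, greatly improves annotation performance.

4. An annotation-based prototype system for Web multimedia information search. Building on a traditional search engine and automatic Web image semantic annotation, a prototype system named PictureBook is proposed. PictureBook uses Web search result clustering, multi-document summarization, and automatic Web image semantic annotation to combine Web page search with image search, returning results that pair text with images and so making it easier for users to acquire knowledge.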

【Abstract】 Various types of information are usually available for Web images, such as basic visual features (color, texture, shape, etc.) and associated textual features. The semantics of Web images are correlated with this associated information. Previous research demonstrates that a huge "semantic gap" exists between the low-level visual space and the high-level semantic space of images, which results in the poor performance of image semantic annotation based on visual content. The associated textual space is closer to the semantic space of Web images than the visual space is, so it can be used to infer the semantics of Web images. However, the relation between the semantic content of Web images and their associated texts is intricate and varied, and different annotation keywords or Web images usually correspond to different semantic distributions. Most previous work either regards the associated texts as a whole or assigns fixed weights to different types of associated text according only to prior knowledge or heuristics, so the performance of Web image semantic annotation still needs to be improved.

This thesis studies the intricacy of the semantic distribution of Web images. Based on the idea of adaptive learning, it proposes several Web image annotation methods with good performance. The main contributions are as follows:

1. An automatic Web image semantic annotation approach based on adaptive learning of the position weights of associated texts. We use the basis expansion method to take into account the semantic contribution of high-order structural relations between different types of associated text, and propose a piecewise penalized weighted regression model to adaptively model a Web image's semantic distribution over its associated texts. Experimental results on a real-world benchmark show that the method greatly improves annotation performance.

2. An automatic Web image semantic annotation approach based on an adaptive model. Building on the position-weight approach above, this method further leverages visual features and prior knowledge to improve annotation performance. To incorporate the contribution of prior knowledge, we propose a constrained piecewise penalized weighted regression model to adaptively model a Web image's semantic distribution over its associated texts. Experimental results on a real-world benchmark show that the method greatly improves annotation performance.

3. An automatic Web image semantic annotation approach based on a conditional random field model. This method provides a unified annotation framework that combines different types of Web image features to improve annotation performance. We further exploit the manually assigned tag resources of Flickr to improve the estimation of semantic correlation between annotation keywords. Experimental results on a real-world benchmark show that the method outperforms the state-of-the-art Web image annotation method.

4. A prototype Web multimedia information retrieval system. We present PictureBook, a novel prototype Web multimedia information retrieval system that combines text and image retrieval using search result clustering, multi-document summarization, and Web image semantic analysis. In particular, users can interactively explore the effect of combined text and image search results on Web information search and knowledge acquisition.
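As a rough illustration of the idea in item 3 of estimating keyword correlation from Flickr tags, the sketch below counts how often two tags appear on the same photo and normalizes the count. The abstract does not say which co-occurrence measure the thesis uses; the Jaccard coefficient, the function names `cooccurrence` and `jaccard`, and the toy tag lists are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tag_lists):
    """Count per-tag and per-pair occurrences over a collection of tag lists."""
    tag_count = Counter()
    pair_count = Counter()
    for tags in tag_lists:
        unique = sorted(set(tags))  # ignore duplicate tags on one photo
        tag_count.update(unique)
        # Pairs are emitted in sorted order, so each pair has one canonical key.
        pair_count.update(combinations(unique, 2))
    return tag_count, pair_count


def jaccard(a, b, tag_count, pair_count):
    """Assumed measure: photos carrying both tags / photos carrying either tag."""
    if a == b:
        return 1.0 if tag_count[a] else 0.0
    both = pair_count[tuple(sorted((a, b)))]
    either = tag_count[a] + tag_count[b] - both
    return both / either if either else 0.0


# Made-up tag lists standing in for Flickr photo tags (not real data).
photos = [
    ["beach", "sea", "sunset"],
    ["beach", "sea"],
    ["city", "night"],
    ["sea", "sunset"],
]
tag_count, pair_count = cooccurrence(photos)
```

Evaluating `jaccard` for every pair of annotation keywords fills a symmetric co-occurrence matrix of the kind the abstract mentions; how that matrix enters the conditional random field is not described in the abstract and is not modeled here.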

  • 【Online publication contributor】 Fudan University
  • 【Online publication issue】 2009, Issue 12