
面向话题的事件信息融合研究与实现

Research and Implementation of Chinese Cross-Document Topic Event Fusion

【Author】 Xu Ronghua (许荣华)

【Advisor】 Qian Peide (钱培德)

【Author Information】 Soochow University, Computer Application Technology, 2009, Master's thesis

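【Abstract, translated from the Chinese】 Event information extraction (Events IE) is currently an important area of information extraction (IE). This thesis proposes a cross-document event information fusion method that builds on event IE, introduces multi-source information fusion theory, and combines it with other information extraction techniques such as named entity recognition and coreference resolution, in order to fuse information about same-topic events drawn from multiple sources and multiple documents. The work has two parts, basic event fusion and topic event fusion, as follows: 1. In basic event fusion, because natural-language expression is diverse, the event roles in event descriptions are normalized, with a different normalization method for time expressions, named entities, and numeric expressions according to how each is typically expressed. 2. In clustering co-referent basic events, event roles are often missing from event descriptions; to raise the recall of co-referent basic event clustering, the concept of a key roles set is proposed. In addition, a similarity algorithm suited to event information is proposed, using the semantic and pragmatic information in events. 3. In event role fusion, each role value starts from a base trustworthiness, which is then adjusted according to the precision and accuracy of the value and the expression characteristics of its role type, so that more precise values are more likely to be selected. In role selection, a voting strategy is applied on top of the trustworthiness computation to increase the trustworthiness of the final result. 4. In topic event fusion, to better represent topic-type events, the thesis defines an Event-based Topic Description Model (ETDM) built on basic events. The model represents topic events in a structured, hierarchical way that is close to human cognitive patterns, and it allows the information to be customized for different needs. Finally, a fusion method for topic events is given. Experiments show that basic event fusion effectively merges event information, greatly reduces redundancy in the information system, and completes the information about individual events; by fusing the redundant and complementary parts of multi-source information, it increases the dimensionality of the target feature vector, reduces the uncertainty of the information, and improves its confidence. Topic event fusion not only links related events effectively but also represents the whole topic in a hierarchical, structured form.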
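
The abstract's second point (tolerating missing event roles when clustering co-referent events) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration: the role names, the contents of `KEY_ROLES`, the threshold, and the similarity measure, which is a plain agreement ratio standing in for the thesis's semantic and pragmatic similarity algorithm, whose details the abstract does not give.

```python
from typing import Dict, Optional

# An event maps role names to normalized values; None marks a missing role.
Event = Dict[str, Optional[str]]

# Hypothetical key roles set; the abstract does not say which roles it contains.
KEY_ROLES = {"time", "place"}


def key_roles_compatible(a: Event, b: Event, key_roles=KEY_ROLES) -> bool:
    """Return False only if a key role is filled in both events with different values.

    A key role that is missing from either event is tolerated, which is what
    keeps recall from dropping when mentions omit roles.
    """
    for role in key_roles:
        va, vb = a.get(role), b.get(role)
        if va is not None and vb is not None and va != vb:
            return False
    return True


def role_overlap_similarity(a: Event, b: Event) -> float:
    """Fraction of roles filled in both events whose values agree (0.0 if none)."""
    shared = [r for r in a if a[r] is not None and b.get(r) is not None]
    if not shared:
        return 0.0
    agree = sum(1 for r in shared if a[r] == b[r])
    return agree / len(shared)


def is_coreferent(a: Event, b: Event, threshold: float = 0.5) -> bool:
    """Treat two mentions as the same event if key roles do not conflict
    and the agreement ratio reaches the (assumed) threshold."""
    return key_roles_compatible(a, b) and role_overlap_similarity(a, b) >= threshold
```

Under these assumptions, a mention with no time value can still be matched to a mention of the same type and place, while two mentions with conflicting times are kept apart regardless of how many other roles agree.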

【Abstract】 Event Information Extraction (Event IE) is an important area of Information Extraction (IE). This dissertation presents a method for cross-document event information fusion. The method builds on Event IE and combines basic information fusion theory with other information extraction technologies, such as Named Entity Recognition and Co-reference Resolution. The dissertation has two main parts: basic event information fusion and topic event information fusion. 1. Before fusion, event roles such as time mentions and named entity mentions must be standardized, because natural-language expressions are diverse. Each type of entity is standardized into a uniform format based on its own characteristics. 2. Event mentions often omit some event roles, so a key roles set is defined to improve the recall of co-referent basic event clustering. Based on the characteristics of events, the dissertation also proposes an approach for calculating the similarity of two events using the pragmatic and semantic information in the event tagging. 3. In event role fusion, the dissertation introduces trustworthiness to improve performance. The trustworthiness of each candidate is adjusted according to its precision, which raises the probability that high-precision role values are selected. A Frequency Voting method is then used to select the event roles, which further increases the trustworthiness of the result. 4. For topic event fusion, an Event-based Topic Description Model (ETDM) is defined, which represents a topic in a hierarchical, structured form similar to the human cognitive model. An approach for fusing topic events is also provided. The experimental results show that the event fusion method is effective at fusing event mentions and organizing related events. It sharply reduces information redundancy and completes the event information. The topic fusion method is likewise effective at linking related events and organizing topics in a hierarchical, structured form.
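
The role fusion step in point 3 (trustworthiness adjusted by precision, followed by voting) can be sketched as weighted voting. This is a hedged illustration, not the dissertation's formula: the `Candidate` fields, the `precision_weight` parameter, and the scoring rule `base_trust * (1 + precision_weight * precision)` are all assumptions introduced here.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Candidate(NamedTuple):
    value: str         # normalized role value reported by one mention
    base_trust: float  # assumed base trustworthiness of the mention, in [0, 1]
    precision: float   # assumed specificity of the value, in [0, 1]


def fuse_role(candidates: List[Candidate], precision_weight: float = 0.5) -> str:
    """Pick one value for an event role by precision-adjusted voting.

    Each candidate votes for its value with weight
    base_trust * (1 + precision_weight * precision), so a more precise
    mention counts for more than a vague one from an equally trusted source.
    """
    if not candidates:
        raise ValueError("no candidate values to fuse")
    scores = defaultdict(float)
    for c in candidates:
        scores[c.value] += c.base_trust * (1.0 + precision_weight * c.precision)
    # Highest total wins; on a tie, the value seen first is returned.
    return max(scores, key=scores.get)


# Usage: two vague mentions against two precise ones (made-up values).
mentions = [
    Candidate("about 100", 0.6, 0.2),
    Candidate("103", 0.6, 1.0),
    Candidate("103", 0.5, 1.0),
    Candidate("about 100", 0.7, 0.2),
]
print(fuse_role(mentions))
```

With the assumed weights, the two precise mentions accumulate a higher total than the two vague ones even though the vague mentions come from slightly more trusted sources, which mirrors the stated goal of favoring high-precision role values.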

  • 【Online publication contributor】 Soochow University
  • 【Online publication year and issue】 2011, Issue S2