基于树核的无指导中文语义关系抽取研究

Research on Unsupervised Chinese Semantic Relation Extraction Based on Tree Kernels

【Author】 黄晨 (Huang Chen)

【Advisor】 钱龙华 (Qian Longhua)

【Author information】 Soochow University (苏州大学), Computer Technology, 2009, Master's thesis

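【Abstract (translated from the Chinese)】 Semantic relation extraction between named entities is one of the main tasks in information extraction. In Chinese semantic relation extraction, supervised learning methods dominate, and no study has yet adopted an unsupervised learning method. Meanwhile, because tree kernel functions have achieved some success in English semantic relation extraction, this paper proposes using tree kernel functions to perform unsupervised Chinese semantic relation extraction. Unsupervised semantic relation extraction is treated as a problem of clustering relation instances represented as parse trees. First, entity pairs that hold a semantic relation in Chinese sentences are extracted as relation instances, with the shortest path-enclosed tree of the parse tree as their structured representation. Next, a convolution tree kernel function is used to compute the structural similarity between two parse trees. Finally, a bottom-up hierarchical clustering algorithm, with complete linkage and average linkage as the cluster similarity measures, groups the relation instances into different clusters, thereby achieving unsupervised Chinese semantic relation extraction. Experiments on the ACE RDC 2005 Chinese benchmark corpus show that the method reaches F-measures of 60.1 for major relation types and 44.6 for relation subtypes, which indicates that unsupervised learning based on tree kernel functions is effective and feasible for Chinese semantic relation extraction.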

【Abstract】 Semantic relation extraction between named entities is one of the main tasks in information extraction. In Chinese semantic relation extraction, supervised learning methods dominate, and so far no work has applied unsupervised learning. Motivated by the success of convolution tree kernel-based semantic relation extraction on English texts, this paper proposes a convolution tree kernel-based approach to unsupervised Chinese semantic relation extraction. We cast unsupervised relation extraction as a problem of clustering relation instances represented as parse trees. First, all named entity pairs that hold a semantic relation in Chinese sentences are extracted as relation instances, with shortest path-enclosed trees as their structural representations. Then, the similarity between two parse trees is computed with a convolution tree kernel. Finally, a bottom-up hierarchical clustering algorithm, using complete linkage and average linkage as cluster similarity measures, groups the relation instances into different clusters, thereby performing unsupervised Chinese relation extraction. Evaluation on the ACE RDC 2005 Chinese benchmark corpus shows that the approach achieves F-measures of 60.1 and 44.6 for major relation type extraction and relation subtype extraction, respectively. This suggests that a tree kernel-based unsupervised method is effective and feasible for Chinese semantic relation extraction.
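
The last two steps of the pipeline described in the abstract (tree kernel similarity, then bottom-up clustering) can be sketched in a few lines of Python. This is an illustrative sketch only, not the thesis's implementation: it assumes the Collins-Duffy form of the convolution tree kernel, a decay factor `lam = 0.4` chosen arbitrarily, and four hypothetical toy trees standing in for shortest path-enclosed trees that a parser would produce. The function names (`tree_kernel`, `normalized_kernel`, `cluster`) are made up for this example.

```python
import math
from functools import lru_cache
from itertools import combinations

# A tree is a nested tuple (label, child, ...); a leaf word is a plain string.

def nodes(tree):
    """Yield every internal node of a tree (leaf words are skipped)."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)

def production(node):
    """Production at a node; the flag separates child labels from leaf words."""
    return (node[0],) + tuple(
        (c[0], True) if isinstance(c, tuple) else (c, False) for c in node[1:]
    )

def tree_kernel(t1, t2, lam=0.4):
    """Collins-Duffy convolution tree kernel: decayed count of shared subtrees."""
    @lru_cache(maxsize=None)
    def delta(n1, n2):
        if production(n1) != production(n2):
            return 0.0
        score = lam
        # Equal productions guarantee the children line up pairwise.
        for c1, c2 in zip(n1[1:], n2[1:]):
            if isinstance(c1, tuple):
                score *= 1.0 + delta(c1, c2)
        return score
    return sum(delta(a, b) for a in nodes(t1) for b in nodes(t2))

def normalized_kernel(t1, t2, lam=0.4):
    """Scale the kernel into [0, 1] so that tree size does not dominate."""
    return tree_kernel(t1, t2, lam) / math.sqrt(
        tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam)
    )

def cluster(sim, k, linkage="average"):
    """Bottom-up clustering on a similarity matrix until k clusters remain."""
    clusters = [[i] for i in range(len(sim))]

    def cluster_sim(a, b):
        values = [sim[i][j] for i in a for j in b]
        # Complete linkage: least similar pair; average linkage: mean of pairs.
        return min(values) if linkage == "complete" else sum(values) / len(values)

    while len(clusters) > k:
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_sim(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] += clusters[y]  # x < y, so deleting y keeps x valid
        del clusters[y]
    return clusters

# Hypothetical toy trees (English words used only for readability).
t_a = ("NP", ("NP", ("NR", "Microsoft")), ("NP", ("NN", "CEO")))
t_b = ("NP", ("NP", ("NR", "Google")), ("NP", ("NN", "CEO")))
t_c = ("VP", ("VV", "located"), ("PP", ("P", "in"), ("NP", ("NR", "Beijing"))))
t_d = ("VP", ("VV", "located"), ("PP", ("P", "in"), ("NP", ("NR", "Suzhou"))))
trees = [t_a, t_b, t_c, t_d]

sim = [[normalized_kernel(a, b) for b in trees] for a in trees]
print(cluster(sim, 2, "complete"))
print(cluster(sim, 2, "average"))
```

On these toy trees both linkage settings group the two noun-phrase instances together and the two verb-phrase instances together, i.e. `[[0, 1], [2, 3]]`. In a real setting the number of clusters `k` and the decay factor would need to be tuned, and the input trees would come from a Chinese parser rather than being written by hand.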

  • 【Online publication contributor】 Soochow University (苏州大学)
  • 【Online publication year and issue】 2011, Issue S2