Research on Sense-Based Disambiguation Methods for Chinese (基于词义的汉语排歧方法研究)

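【Author】 Liu Yaqing (刘亚清)

【Advisor】 Chen Cibai (陈次白)

【Author Information】 Nanjing University of Science and Technology (南京理工大学), Information Science, 2004, Master's thesis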

【Abstract】 Polysemy is a pervasive linguistic phenomenon, yet in a specific context a word carries only one definite sense. Determining the sense of a polysemous word in its context is the task of word sense disambiguation (WSD). This thesis examines WSD for Chinese. It first states the purpose and significance of WSD research, then surveys the WSD methods in common use, grouped by the knowledge source each relies on, and explains several typical methods in detail. Next, working from the syntactic parse tree, it applies a "head-word association method" (中心词关联法) to extract the feature words that best characterize the sense of a polysemous word. On that basis, by computing correlation coefficients between the sememes (义原) of each sense of the polysemous word and the sememes of the feature words, it proposes a WSD method based on sememe co-occurrence frequency. Finally, drawing on this work, it outlines a design for a Chinese WSD system and implements some of its modules in code.
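The abstract does not give the thesis's actual formula, so the following is only a minimal sketch of what sense selection by sememe co-occurrence might look like. It assumes each sense and each feature word has already been mapped to a list of sememes, and it uses the Dice coefficient over co-occurrence counts as a stand-in for the correlation coefficient. The function names, the averaging step, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import product


def build_cooccurrence(sememe_bags):
    """Count how often each sememe, and each unordered sememe pair, occurs per context."""
    single = Counter()
    pair = Counter()
    for bag in sememe_bags:
        unique = sorted(set(bag))          # count each sememe once per context
        single.update(unique)
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                pair[(a, b)] += 1          # keys are stored in sorted order
    return single, pair


def correlation(a, b, single, pair):
    """Dice coefficient of two sememes (an assumed stand-in, not the thesis's measure)."""
    if a == b:
        return 1.0
    key = (a, b) if a < b else (b, a)
    denom = single[a] + single[b]
    return 2.0 * pair[key] / denom if denom else 0.0


def disambiguate(sense_sememes, feature_sememes, single, pair):
    """Pick the sense whose sememes correlate most, on average, with the feature sememes."""
    scores = {}
    for sense, sememes in sense_sememes.items():
        pairs = list(product(sememes, feature_sememes))
        total = sum(correlation(a, b, single, pair) for a, b in pairs)
        scores[sense] = total / len(pairs) if pairs else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: sememe bags taken from four training contexts.
training_bags = [
    ["institution", "money", "deposit"],
    ["money", "deposit", "institution"],
    ["water", "land", "flow"],
    ["water", "flow"],
]
# Hypothetical sememe sets for two senses of one ambiguous word.
sense_sememes = {
    "finance": ["institution", "money"],
    "river": ["land", "water"],
}
# Sememes of the feature words extracted from the sentence being disambiguated.
feature_sememes = ["deposit"]

single, pair = build_cooccurrence(training_bags)
best, scores = disambiguate(sense_sememes, feature_sememes, single, pair)
print(best, scores)
```

Averaging over all sememe pairs keeps senses with many sememes from winning on count alone. A real system would likely weight feature words by their syntactic relation to the ambiguous word, which this sketch omits.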

  • 【Classification No.】H13
  • 【Times Cited】3
  • 【Downloads】171