

The Study of Verb-Object Construction in Machine Translation

【Author】 黄淑美

【Advisor】 WU Zhen-guo (吴振国)

【Author information】 Central China Normal University, Linguistics and Applied Linguistics, 2012, PhD

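【Abstract (translated from Chinese)】 This thesis focuses on verb-object (V-O) constructions in Chinese-English machine translation and explores how verbs combined with different nominal objects can be re-categorized in order to improve machine translation performance. Translation is a complex process. From a cognitive point of view, the first requirement is to understand the meaning of the source-language text at the semantic, syntactic, and pragmatic levels. In addition, the translation process requires (1) knowledge of the source language; (2) knowledge of the target language; (3) contrastive knowledge of the source and target languages; (4) knowledge of the subject matter and general knowledge; and (5) knowledge of the cultural background of both languages. However complex the task, in today's information technology age, with the world becoming ever more integrated, breaking down language barriers is urgent. Experts and scholars in machine translation and linguistics therefore keep working in the hope that computers will one day translate efficiently and automatically among many languages. Yet most of the currently popular online machine translation systems cannot translate source texts into the target language well, and Chinese-English translation fares worst. The problem lies not only in computer engineering or system design; the most fundamental problem is that research on Chinese grammar does not yet meet the needs of machine translation. We have therefore chosen the Chinese V-O construction as the focus of this study.
The first task of the thesis is to re-categorize V-O constructions, not only from within the language itself (intralingually) but also from the perspective of Chinese-English (interlingual) translation. V-O constructions are divided into core grammar, sub-core grammar, sub-peripheral grammar, and peripheral grammar. Core grammar refers to V-O constructions with a strong capacity for analogy and for direct translation both within and across languages; peripheral grammar refers to V-O constructions that allow neither analogy nor direct translation, whether intralingually or interlingually; sub-core and sub-peripheral grammar refer to the classes that fall in between. The second task is to design different dictionaries to match the re-categorized V-O constructions and to simulate how a machine would handle the Chinese-English translation of each class.
The thesis has eight chapters. Chapter 1 outlines the development of machine translation, especially the problems in Chinese-English translation, and examines the different classifications of V-O constructions proposed by scholars and the problems with them. The thesis draws mainly on the "core grammar" theory of Chomsky's transformational-generative grammar and on the Construction Grammar theory of Goldberg and Jackendoff to distinguish "core grammar" from "peripheral grammar". Tested against Chinese-English automatic translation, it separates the core-grammar rules of V-O constructions from the peripheral-grammar rules, adjusts the classification of objects, clarifies the collocation rules of V-O constructions, and attempts to find or establish workable regularities for Chinese-English V-O translation. Chapter 2 attempts to distinguish core and peripheral grammar from the perspectives of Universal Grammar, Construction Grammar, and language acquisition. It then discusses the main types of peripheral grammar: those formed by special words, those formed by bound (粘合式) structural forms, and those formed by special transformational rules. It also discusses the importance of the core/peripheral distinction for language research, language teaching, dictionary compilation, and information processing. Chapter 3 examines how to distinguish core and peripheral grammar within V-O constructions. Chapter 4 tests three online translation systems on Chinese-English V-O translation to identify the existing problems. Chapters 5 and 6 discuss methods for handling the Chinese-English translation of ordinary and special V-O constructions, respectively. Examples include building a dictionary for V-O idioms, which belong to peripheral grammar; exhaustively collecting every sense of each verb in a Chinese verb semantic-class dictionary; and establishing Chinese-English V-O translation rules and storing them in a rule base for retrieval. Chapter 7 is a simulation test that models how a machine translation system processes the different classes of V-O constructions and translates them into English. Chapter 8 summarizes the study and reviews its shortcomings as a basis for further research.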

【Abstract】 Translation is a complicated process. The first step is to understand the meaning of the source language (SL) text at the semantic, syntactic, and pragmatic levels. Beyond that, the translator needs knowledge of the SL; knowledge of the target language (TL); contrastive knowledge of the SL and TL; general knowledge and knowledge of the field concerned; and knowledge of the culture behind both languages. However difficult the task, computer engineers and linguists keep working to improve the performance of machine translation systems so as to eliminate language barriers, which is essential in this high-tech era of rapid globalization. The hope is that language will one day no longer be an obstacle to communication among people around the world.
However, the currently popular online machine translation systems are all far from satisfactory, especially in Chinese-English translation. This is not only due to the design of the systems; one of the main reasons is the inadequacy of research on Chinese grammar. We have therefore chosen an important construction in Chinese, the verb-object construction (V-O construction), for detailed study.
The first objective of this thesis is to re-categorize V-O constructions. We examine the construction from both intra-language and inter-language perspectives and then divide V-O constructions into four groups: core grammar, sub-core grammar, sub-peripheral grammar, and peripheral grammar. The core grammar group contains V-O constructions that have strong analogical ability and can be translated word for word with high accuracy. The peripheral grammar group contains V-O constructions that permit neither analogy nor word-for-word translation. The remaining V-O constructions, which lie between core and peripheral grammar, belong to the sub-core or sub-peripheral groups. The second task of this thesis is to help establish various dictionaries or databases based on this categorization. It is hoped that these dictionaries can eventually help improve the efficiency and efficacy of machine translation systems.
The thesis has eight chapters. Chapter 1 introduces the study. We first review the methods that various scholars have proposed for categorizing V-O constructions and discuss their advantages and disadvantages. We then introduce Chomsky's core grammar theory in Universal Grammar (UG), the Construction Grammar (CG) theory of Goldberg & Jackendoff, and the theory of core and peripheral grammar proposed by Dr WU Zhen-guo for re-categorizing V-O constructions. These theories form the basis on which we establish a new grouping system for V-O constructions and set up clear-cut rules for verbs and objects that commonly co-occur, in order to facilitate Chinese-English machine translation as far as possible.
Chapter 2 attempts to distinguish core grammar from peripheral grammar by means of UG and CG theory and of first and second language acquisition. We also study the types of peripheral grammar, such as those formed by special words or phrases, constructions formed by agglutination, and constructions formed by special transformational rules. In addition, we discuss the importance of distinguishing core from peripheral grammar for language study, language teaching, dictionary compilation, and information processing.
Chapter 3 studies methods for distinguishing core grammar from peripheral grammar in V-O constructions. Chapter 4 presents and analyzes in detail the output of three online machine translation systems when translating V-O constructions, and examines their inadequacies.
Having studied the problems these three systems have with V-O constructions, we look for solutions in Chapters 5 and 6. The first step is to place the different types of V-O constructions into the four categories described above; translation rules can then be established where required. For example, a dedicated dictionary can be compiled for V-O idioms, since they are categorized under peripheral grammar. V-O constructions of this type cannot be handled by word-for-word translation, because they usually carry a specific meaning that cannot be derived from the surface meaning of their component words. Alternatively, a rule base of translation rules can be set up for retrieval during Chinese-English translation. In Chapter 7, we simulate the operation of a machine translation system, applying the rules and settings discussed in Chapters 5 and 6, as a preliminary test of feasibility. Finally, Chapter 8 summarizes what we have found and established and points out the inadequacies of the study for future improvement.
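
The dictionary-first handling described in the abstract (an idiom dictionary for peripheral V-O constructions, word-for-word translation for core ones) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function `translate_vo`, the dictionary names, and every entry in them are hypothetical examples chosen for illustration. The sub-core and sub-peripheral categories are declared but not handled, because the abstract does not state their rules.

```python
from enum import Enum
from typing import Optional, Tuple


class Category(Enum):
    """The four groups named in the abstract."""
    CORE = "core"
    SUB_CORE = "sub-core"              # not handled in this sketch
    SUB_PERIPHERAL = "sub-peripheral"  # not handled in this sketch
    PERIPHERAL = "peripheral"


# Hypothetical idiom dictionary: V-O idioms whose meaning cannot be
# derived word for word (entries are illustrative assumptions).
IDIOM_DICT = {
    ("吃", "醋"): "be jealous",
    ("拍", "马屁"): "flatter",
}

# Hypothetical bilingual lexicons used for word-for-word translation.
VERB_LEXICON = {"吃": "eat", "买": "buy"}
NOUN_LEXICON = {"苹果": "apples", "书": "books", "醋": "vinegar"}


def translate_vo(verb: str, obj: str) -> Tuple[Optional[Category], Optional[str]]:
    """Translate one verb-object pair, checking the idiom dictionary first.

    Returns (category, translation), or (None, None) when neither the
    idiom dictionary nor the lexicons cover the pair.
    """
    # Peripheral grammar: the whole pair is looked up as a unit.
    if (verb, obj) in IDIOM_DICT:
        return Category.PERIPHERAL, IDIOM_DICT[(verb, obj)]
    # Core grammar (simplified): translate verb and object separately.
    if verb in VERB_LEXICON and obj in NOUN_LEXICON:
        return Category.CORE, f"{VERB_LEXICON[verb]} {NOUN_LEXICON[obj]}"
    return None, None


print(translate_vo("吃", "醋"))    # idiom lookup takes priority
print(translate_vo("吃", "苹果"))  # word-for-word translation
```

Checking the idiom dictionary before the lexicons is the design choice that matters here: 吃醋 would otherwise be rendered literally as "eat vinegar", which is the kind of error the abstract attributes to word-for-word translation of peripheral constructions.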

