

A Lexical Study of the Yuexi Travel Diary in The Travel Diaries of Xu Xiake Based on Word-Frequency Statistics

【Author】 Zhang Qijuan (张其娟)

【Advisor】 Hai Liuwen (海柳文)

【Author information】 Guangxi University for Nationalities, Linguistics and Applied Linguistics, 2011, Master's thesis

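【Summary】 Scholarship on The Travel Diaries of Xu Xiake (《徐霞客游记》) has mostly approached the work from the perspectives of geography, literature, tourism, history, folklore, religion, aesthetics, and culture; studies from the standpoint of language and script are comparatively few. Using computer-assisted word-frequency statistics, we take a lexical perspective and select the Yuexi Travel Diary (《粤西游日记》) portion of the work. Our object of study is its lexical and linguistic characteristics, together with the patterns of development and change in its vocabulary and language. We try to broaden the field of research on the vocabulary of Yuexi and to provide some material or corroborating evidence for the study of Early Modern Chinese (近代汉语) vocabulary. The basic procedure is: 1. enter the Yuexi Travel Diary into the computer, following the edition collated jointly by Chu Shaotang and Wu Yingshou and published by Shanghai Ancient Books Publishing House (上海古籍出版社) in 1982; 2. proofread the text; 3. segment the text automatically with the ICTCLAS word-segmentation software and correct the segmentation results by hand; 4. compute word frequencies with the SPSS statistics package; 5. analyze the statistical results and study the lexical system. The resulting series of word lists for the Yuexi Travel Diary contains 123,230 words in total, of which 10,060 are distinct entries. According to frequency of occurrence, the vocabulary is divided into four zones: a core zone, a high-frequency zone, a mid-frequency zone, and a low-frequency zone. The core zone contains only one hundred words, which occur at a high rate and have broad coverage. For the high-frequency and mid-frequency zones, a comparison of their word classes shows the part-of-speech characteristics of each zone. In the low-frequency zone, proper nouns account for a relatively large share; listing their frequencies and dividing this broad class into subcategories gives a fairly direct view of their distribution. Excluding proper nouns, the Yuexi Travel Diary has 7,870 entries. These divide into monosyllabic and polysyllabic words, with monosyllabic words predominating. Analyzed by structure, the polysyllabic words fall into seven major types, among which the coordinate and modifier-head structures hold an overwhelming advantage over the others. The average word length of the Yuexi Travel Diary vocabulary is clearly lower than that of Modern Chinese, a difference that results from the progressive polysyllabification of words. At the same time, the vocabulary reflects the livelihood and folk customs of Guangxi at the time from four angles: clothing, food, housing, and travel. In addition, the writing technique of the work has its own distinctive features, and its author's thinking has certain limitations.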

【Abstract】 Research on The Travel Diaries of Xu Xiake has come mostly from geography, literature, tourism, history, folk custom, religion, aesthetics, and cultural studies, and much less from the study of language. With the assistance of a computer, we compile word-frequency statistics for the Yuexi Travel Diary portion of the work, study the characteristics of its vocabulary and language and the patterns by which they change, and try to extend the scope of vocabulary research and to provide data or supporting evidence for research on Early Modern Chinese vocabulary. The five basic steps are: 1. input the text according to the 1980 edition published by Shanghai Ancient Books Publishing House and collated by Chu Shaotang and Wu Yingshou; 2. check the text; 3. carry out machine word segmentation with the ICTCLAS segmentation software and correct the segmentation results manually; 4. compute word frequencies with the SPSS statistics software; 5. analyze the results and study the vocabulary system. The resulting word lists for the Yuexi Travel Diary contain 123,230 words, of which 10,060 are distinct entries. According to frequency of occurrence, we divide the vocabulary into four zones: the core zone, the high-frequency zone, the mid-frequency zone, and the low-frequency zone. The core zone contains only 100 words, which have a high rate of occurrence and wide coverage. We compare the word classes of the high-frequency and mid-frequency zones to show the characteristics of each. In the low-frequency zone, proper nouns make up a comparatively large proportion; we list their frequencies and divide them into subtypes so that their distribution can be seen directly. Apart from proper nouns, there are 7,870 entries in the Yuexi Travel Diary. These entries divide into monosyllabic and polysyllabic words, and the monosyllabic words predominate. By structure, the polysyllabic words can be divided into seven major types, of which the coordinate and modifier-head structures hold an absolute advantage over the others. The average word length of the Yuexi Travel Diary vocabulary is clearly lower than the average word length of Modern Chinese, which is a result of the polysyllabification of words. At the same time, the vocabulary reflects the livelihood and customs of the people of Guangxi at that time from four angles: clothing, food, housing, and travel. In addition, the writing technique of the work has its own special features, and the author's thinking has some limitations.
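The statistical steps described above (frequency counting, division into four zones, average word length, and the share of monosyllabic entries) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's actual procedure: Python's `collections.Counter` stands in for the SPSS frequency count, the input is assumed to be already segmented (the ICTCLAS step is not reproduced), and the function names and the `high_min` and `mid_min` cut-offs are hypothetical, since the abstract gives a size only for the core zone (100 words). The sample tokens are invented and are not quoted from the diary.

```python
from collections import Counter


def frequency_zones(tokens, core_size=100, high_min=50, mid_min=5):
    """Count word frequencies and split the word types into four zones.

    core_size=100 follows the abstract; high_min and mid_min are
    hypothetical cut-offs, since the abstract does not state them.
    """
    counts = Counter(tokens)
    zones = {"core": [], "high": [], "mid": [], "low": []}
    # most_common() yields (word, frequency) pairs, most frequent first.
    for rank, (word, freq) in enumerate(counts.most_common()):
        if rank < core_size:
            zones["core"].append(word)
        elif freq >= high_min:
            zones["high"].append(word)
        elif freq >= mid_min:
            zones["mid"].append(word)
        else:
            zones["low"].append(word)
    return counts, zones


def average_word_length(counts):
    """Mean length in characters, weighted by how often each word occurs."""
    total_tokens = sum(counts.values())
    return sum(len(word) * freq for word, freq in counts.items()) / total_tokens


def monosyllabic_share(counts):
    """Fraction of distinct entries written with one character (one syllable)."""
    return sum(1 for word in counts if len(word) == 1) / len(counts)


# Invented, already segmented toy text (not a quotation from the diary).
tokens = "山 之 东 有 洞 洞 门 向 南 山 之 西 有 石峰 石峰 甚 峭 山 山 之".split()
counts, zones = frequency_zones(tokens, core_size=2, high_min=3, mid_min=2)

print("total words:", sum(counts.values()), "distinct entries:", len(counts))
print("core zone:", zones["core"])
print("average word length:", round(average_word_length(counts), 2))
print("monosyllabic share of entries:", round(monosyllabic_share(counts), 2))
```

Word length is measured here in characters, which for Chinese text corresponds to syllables, so the same count supports both the monosyllabic/polysyllabic split and the average word length comparison.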

【Key words】 Chinese language; vocabulary; word frequency; Yuexi Travel Diary; culture; study

