

An Exploratory Study of TEM-4 Reading Task Difficulty

【Author】 侯艳萍 (Hou Yanping)

【Supervisor】 邹申 (Zou Shen)

【Author information】 Shanghai International Studies University, English Language and Literature, 2008, PhD


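【Abstract (translated from the Chinese)】 The Test for English Majors, Grade 4 (TEM-4) is currently the only large-scale criterion-referenced achievement test in China designed and administered specifically for students of English language and literature at the foundation stage. In line with the requirements of the Teaching Syllabus for English Majors in Institutions of Higher Education, test construction is organized by the English sub-committee of the national foreign language teaching advisory committee, and the test is administered by the office of that advisory committee. TEM-4 aims to promote implementation of the syllabus at the foundation stage and to measure, objectively and accurately, English majors' basic skills and practical ability to use English, thereby helping to improve the quality of English teaching in China. One basic purpose of testing is to gather the information needed for decision-making. To be authoritative, fair, and objective, any test must faithfully measure what it is intended to measure. Appropriate task difficulty is therefore a crucial precondition both for test writers controlling the level of a paper and for test users interpreting scores sensibly, and it matters to test designers, test-takers, and test users alike. TEM-4 is no exception. Yet research on test task difficulty remains scarce. Taking the reading comprehension section of TEM-4 as its point of entry, this study analyzes the factors that affect the difficulty of TEM-4 reading tasks from the perspective of reading task characteristics, with the aim of better understanding the nature of those tasks and forming an objective and fair view of them. The study has two main aims: (1) to identify the main task characteristics likely to affect the difficulty of TEM-4 reading tasks; and (2) to clarify the common constructs and measurement properties of these systematically varying task characteristics and the size of their effect on task difficulty, and thus to establish how far task difficulty can be predicted from them. The research materials are the eight reading passages and their 40 items from the 2005 and 2006 TEM-4 papers. The data are the reading scores of about 300,000 candidates who sat TEM-4 in those two years, provided by the office of the foreign language teaching advisory committee. The study also designed a rating instrument of 143 variables covering propositional content (subject matter, genre, vocabulary, etc.), organizational characteristics (rhetorical devices, grammar, rhetorical organization, coherence, cohesion, etc.), pragmatic characteristics (speech acts and language functions, register, etc.), and length and other language-related features (length of passages, paragraphs, and sentences; sentence types; reference; negation; etc.). The analysis is carried out at three levels: passage, key sentence, and item. The specific variables differ from level to level. By method of judgment, the 143 variables fall into three types: counted, calculated, and rated. The researcher produced the first two types herself; the rated variables were scored by 10 experts using a specially designed rating scale. The raw data were then processed with several techniques: descriptive analysis for basic data exploration, reliability analysis to check rater consistency, standardization of variables to provide a common basis for comparison, correlation analysis to identify the individual variables with the greatest influence on task difficulty, exploratory factor analysis to explore factor dimensions, confirmatory factor analysis to establish the model of reading task characteristics, and multiple linear regression to determine the proportion of task difficulty that can be predicted. The results show that, at the level of individual variables, 22 task characteristic variables have a fairly significant effect on task difficulty. They fall into 11 categories: grammatical complexity, abstractness of information, vocabulary, new information, complexity of rhetorical organization, register, type of inference, length, key sentence salience, degree of specialization of item content, and negation. Moreover, passage-level variables affect task difficulty more than key-sentence-level and item-level variables. This finding enriches our understanding of the predictors of task difficulty and challenges the long-held view that item factors are the main determinant of difficulty in reading comprehension. Further construct analysis of these variables shows that three common factors emerge: a passage complexity factor, a passage length factor, and a key sentence salience factor. The final regression analysis establishes that about 31.2% of TEM-4 reading task difficulty can be explained by these factors, which shows that task characteristics account for task difficulty to a certain extent. The conclusion sets out the study's theoretical, methodological, and practical implications for language testing, and notes its limitations and directions for future research.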

【Abstract】 As the only large-scale nationwide standardized test for English majors at the foundation stage in China so far, TEM-4 (Test for English Majors, Grade 4) is administered by the national foreign language teaching advisory committee under the higher education department of the Ministry of Education. With the Teaching Syllabus for English Majors as its guiding principle, and with the English sub-committee of the national foreign language teaching advisory committee in charge of test construction, TEM-4 is a criterion-referenced test aimed at evaluating the quality of English teaching in China. The purpose of TEM-4 is to promote the implementation of the teaching syllabus and to measure English majors’ ability to use English, so as to help improve the quality of English teaching in China. One of the primary purposes of a test is to provide the information about test-takers that is needed for decision-making. To be scientific, unbiased, and authoritative, a test must measure what it purports to measure. Thus, one of the challenges facing test designers and test users who are concerned with gauging the influence of task characteristics on candidate performance is how to determine the difficulty of tasks. A greater understanding of the factors affecting task difficulty can assist in the choice of a suitable range of tasks for assessment purposes and also has the potential to influence the way levels of test performance are described. It is now well understood that aspects of test task difficulty can have an important effect on test performance, and it would thus seem imperative to incorporate information about test task difficulty explicitly into the design of language tests and, more importantly, into the interpretation of test scores. Yet test task difficulty remains a much neglected topic. A major focus of this research is identifying the variables that uniquely account for significant variance in the percent correct obtained by examinees for each item in the TEM-4 reading comprehension part. This dissertation explores the relationship between reading task difficulty and reading task characteristics, with a view to forming a better understanding of the nature of the TEM-4 reading task. The principal aims of the present study are: 1) to identify key task characteristics and task conditions that are most likely to affect the difficulty of TEM-4 reading tasks; 2) to investigate the impact on test scores of systematically varying task characteristics and task conditions and, in cases where clear effects are noted, to explore possible reasons for differences in task difficulty; to work out the factor structure or potential relationships among these characteristics; and to determine how far item difficulty in the TEM-4 reading section can be accounted for by the set of factors specified in this research. The research materials are the eight reading passages and the corresponding 40 items on the 2005 and 2006 TEM-4 papers. The test scores of more than 300,000 test-takers are processed to obtain the item difficulty index. A rating instrument with 143 test task characteristic variables is constructed, including propositional content variables (subject matter, genre, vocabulary and other aspects), organizational characteristics variables (rhetorical features of patterns, grammar, rhetorical organization, coherence, cohesion), pragmatic characteristics variables (speech acts and language functions, register), and length and other language-related variables (number of words, different types of sentences, negations and frontings). The analysis is performed at three levels: passage, key sentence, and item. In addition, the variables are of three different types: counted, mathematically calculated, and rated. The first two types are produced by the researcher herself, and the rated variables are collected from 10 experts using a specifically designed rating sheet. Multiple techniques are used to analyze the raw data: descriptive statistical analysis, reliability of ratings, standardization of variables, pair-wise correlation, exploratory factor analysis, confirmatory factor analysis, and multiple regression. The results indicate that 22 reading task characteristic variables are responsible for TEM-4 reading task difficulty, among which passage variables play a far more important part than key sentence variables and item variables. Three salient constructs are discovered, namely a passage complexity factor, a passage length factor, and a key sentence salience factor, and they account for 31.2% of the variation in reading task difficulty. The concluding part summarizes the answers to the research questions and points out the theoretical and methodological implications of the study and its implications for language testing practice. The limitations of the study and suggestions for future research are given at the end.
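As a rough illustration of three of the steps named in the abstract (the item difficulty index as percent correct, standardization of variables, and pair-wise correlation), the following minimal Python sketch may help. It is not the dissertation's procedure or code: the response matrix, the `passage_length` values, and all function names are invented for illustration, and the factor analyses and multiple regression used in the study are not reproduced here.

```python
from math import sqrt


def item_difficulty(responses):
    """Proportion of test-takers who answered each item correctly.

    responses: list of rows, one per test-taker; each row holds a
    0/1 score for every item (1 = correct, 0 = incorrect).
    """
    n_takers = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_takers for j in range(n_items)]


def standardize(values):
    """Convert values to z-scores (mean 0, population standard deviation 1).

    Assumes the values are not all identical (otherwise the standard
    deviation is zero and the division fails).
    """
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson correlation, computed as the mean product of z-scores."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical toy data: 4 test-takers x 3 items (not TEM-4 data).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
difficulty = item_difficulty(responses)  # [1.0, 0.5, 0.25]

# Hypothetical task characteristic: word count of the passage each item belongs to.
passage_length = [200, 350, 500]

r = pearson_r(passage_length, difficulty)
# With a single predictor, the share of variance explained is r squared.
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

With a single predictor, the squared correlation plays the role that R² plays in a multiple regression. The 31.2% reported in the abstract comes from a regression on three factors, so the toy figure printed here is in no way comparable to it.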

  • 【Classification number】H319
  • 【Times cited】4
  • 【Downloads】897