
Validation Study of the Computer-based College English Placement Test

【Author】 Pan Xiaolin (潘晓琳)

【Supervisor】 Xiao Yunnan (肖云南)

【Author Information】 Hunan University, Foreign Linguistics and Applied Linguistics, 2008, Master's thesis

【Abstract (translated from the Chinese)】 To meet the needs of building an international research university and of social development, and to improve teaching quality through a placement-based teaching reform, the College English placement testing project team at Hunan University developed the Computer-based College English Placement Test (CCEPT). To ensure the quality of the test, this study carries out a fairly comprehensive validation of the newly developed test and offers suggestions for the further development of CCEPT. Guided by modern language testing theory, the study examines whether CCEPT validly reflects students' English language ability. Drawing on the three validity types — content, criterion, and construct validity — set out in the 1985 American Standards for Educational and Psychological Testing, the validation framework of this study focuses on construct validity, content validity, and face validity. The validity of CCEPT is examined through analyses of test takers' item responses, their scores, the test papers, and questionnaire results; all data are analyzed with the statistical package SPSS. Content validity is judged by calculating item facility values and discrimination indices and by analyzing how well the paper covers the content specified in the test syllabus; construct validity is judged through correlation analysis and factor analysis; and face validity is judged through a questionnaire administered to test takers.

The results show that CCEPT performs reasonably on construct, content, and face validity, though much remains to be improved. The correlation between Reading Comprehension and Writing is only +0.169, largely because of the low scoring reliability of the Writing section. Factor analysis by principal component analysis shows that the items measure "listening and writing ability" and "reading ability"; however, as a communicative language test, CCEPT admits to the speaking test only those candidates whose total score reaches 050 or above, which will have a negative washback on test takers. Overall, the content validity of the test is also fairly satisfactory. The facility value of the Listening section is 0.464, close to the ideal value of 0.5, but most listening items target specific details rather than the function of utterances, which does not fully meet the requirements of the syllabus. The Reading section distributes its items reasonably across the micro-skills specified in the syllabus; its facility value is 0.569 (lowest 0.465, highest 0.708), although Part D is rather difficult. Its discrimination index is 0.27, close to the ideal value of 0.3. The Writing topics are close to students' everyday lives, but the scoring reliability of this section is low. The questionnaire covered test takers' computer familiarity, overall impressions of the test, test fairness, possible washback, and the skills the items measure; the results show that most test takers approved of the test.

The thesis also makes several suggestions: CCEPT should expand its item bank to ensure item quality and test stability and to minimize mode effects, and it should further improve its reliability — especially the scorer reliability of the Writing section — by strengthening rater training. The findings help improve computer-based English placement testing and also offer some guidance for placement testing in other disciplines.
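The construct-validity evidence above rests on subtest correlations such as the +0.169 between Reading Comprehension and Writing, i.e. the ordinary Pearson product-moment coefficient. As a minimal sketch of that computation — the score lists below are invented for illustration and are not the thesis data — the coefficient is the covariance of the two score sets divided by the product of their standard deviations:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))  # unscaled covariance
    sx = sqrt(sum((a - mx) ** 2 for a in x))              # unscaled std. dev. of x
    sy = sqrt(sum((b - my) ** 2 for b in y))              # unscaled std. dev. of y
    return cov / (sx * sy)

# Hypothetical subtest scores for 8 examinees (illustrative only)
reading = [12, 15, 14, 18, 10, 16, 13, 17]
writing = [8, 11, 9, 10, 7, 12, 9, 11]

print(round(pearson_r(reading, writing), 3))
```

A coefficient near +1 indicates that the two subtests rank examinees similarly; a value as low as +0.169, as reported for Reading vs. Writing, suggests the two sections share little variance — consistent with the thesis's diagnosis of unreliable writing scores.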

【Abstract】 The Computer-based College English Placement Test (CCEPT) at Hunan University is a new test that aims to establish a scientific placement system tailored to Hunan University, to facilitate students' academic success, and to meet the requirements of building an international research university. This study is a comprehensive validation of CCEPT, and several suggestions are offered for the improvement of the test.

The study is firmly rooted in the modern paradigm of test validation. The author employs the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association & National Council on Measurement in Education, 1985) to build the validity argument. Based on the theory of language testing, particularly communicative language testing, the study focuses on content validity, construct validity, and face validity in evaluating whether the test reflects test takers' English language performance in the light of the test specifications. Quantitative and qualitative data were gathered from test takers, and all statistical analyses were performed with SPSS 13.0. Logical analysis was used to evaluate how adequately the test content represents the content domain; item facility values and discrimination indices were calculated; correlation study and factor analysis were conducted to examine construct validity; and students' responses to a questionnaire were analyzed to assess face validity.

Results show that the validity of CCEPT is convincing. CCEPT has relatively high construct validity: the correlations between subtests, and between each subtest and the total test, are generally satisfactory. However, the correlation between Reading Comprehension and Writing is as low as +0.169, which could be due to the marking reliability of the Writing test. The factor analysis shows that the two extracted factors can be named "ability of listening and writing" and "ability of reading", and they explain the general test format of CCEPT. However, the speaking test is available only to candidates whose final written score is 050 or above; as a communicative language test, CCEPT therefore does not measure all aspects of English language competence.

Analyses of content validity also show generally satisfactory results. The facility value of Listening Comprehension is 0.464, close to the ideal value of 0.5; however, the listening component focuses mainly on specific details rather than on the function of utterances, and so fails to meet the specifications adequately. The Reading Comprehension items reflect a harmonious balance among different micro-level skills; the overall facility value is 0.569, with a minimum of 0.465 and a maximum of 0.708, and the discrimination index of this component is 0.27 (close to the standard value of 0.3). The writing topics are close enough to everyday life to encourage students to write, but a study of inter-rater consistency shows that scorer reliability is rather low in the Writing test.

Responses to questionnaires concerning computer familiarity and computer anxiety, general impressions of the test, fairness of the test, possible backwash effects, and the skills tested show that most test takers found CCEPT satisfactory and took it seriously.

Finally, the thesis puts forward several suggestions: CCEPT should enlarge its item bank in order to ensure the quality and stability of the test, and mode effects should be reduced as far as possible. Beyond the validity analysis, the thesis further points out that CCEPT should improve its reliability, especially scorer reliability in writing scoring. The findings have important practical implications for implementing a computer-based English placement test, and they are also applicable in other academic contexts.
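The item statistics reported above — facility values near the ideal 0.5 and discrimination indices near 0.3 — follow standard classical-test-theory definitions. As a hedged sketch (the examinee data and function names below are illustrative, not taken from the thesis), the facility value is the proportion of correct responses to an item, and the upper-lower discrimination index contrasts the top and bottom fractions of examinees ranked by total score:

```python
def facility_value(item_scores):
    """Proportion of test takers answering the item correctly (ideal around 0.5)."""
    return sum(item_scores) / len(item_scores)

def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-minus-lower discrimination index: facility value among the top
    `fraction` of examinees (ranked by total score) minus the facility value
    among the bottom `fraction`; values near 0.3 or above are usually
    considered acceptable."""
    n = len(total_scores)
    k = max(1, int(n * fraction))
    ranked = sorted(range(n), key=lambda i: total_scores[i])  # ascending by total
    low = [item_scores[i] for i in ranked[:k]]    # bottom group on this item
    high = [item_scores[i] for i in ranked[-k:]]  # top group on this item
    return sum(high) / k - sum(low) / k

# Toy data: 10 examinees, 1 = correct / 0 = wrong on a single item
item = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
totals = [31, 38, 45, 40, 55, 60, 35, 58, 62, 70]

print(facility_value(item))                 # 0.6
print(discrimination_index(item, totals))   # 1.0
```

An item with a facility value far above 0.5 is too easy (or too hard, far below), and an item whose strong and weak examinees succeed at similar rates discriminates poorly — the two criteria the thesis applies to the Listening and Reading sections.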

  • 【Online Publisher】 Hunan University
  • 【Online Publication Issue】 2009, No. 01
  • 【Classification Code】 H319
  • 【Downloads】 217