高中物理试题难度的影响因素研究

A Study on the Factors Affecting the Difficulty of Physics Examination Questions in Senior High School

【Author】 Du Mingrong (杜明荣)

【Advisor】 Liao Boqin (廖伯琴)

【Author information】 Southwest University (西南大学), Curriculum and Teaching Theory, 2008, Ph.D.

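【Abstract (English translation of the Chinese original)】 Examinations are an extremely important means of educational evaluation. Although they have risen, declined, been abolished, and been restored several times over their history, the fact of their existence and development is beyond dispute. At present, nearly 10 million candidates sit China's higher-education entrance examination each year. An examination on this scale touches countless households and every nerve of society, so the importance of improving its quality is self-evident. Among the many indicators used to evaluate item quality, difficulty is one of the most important, and an accurate grasp of item difficulty during item writing is an important guarantee of item-writing quality. In practice, however, the estimation and control of item difficulty are often unsatisfactory. The root cause is an insufficient grasp of the factors that influence item difficulty and a lack of in-depth research on them; in physics, for example, there has so far been no systematic study of the factors that influence the difficulty of physics items. A systematic study of these factors, from theory to empirical evidence, would therefore provide theoretical guidance for item writing and teaching in practice, and has both theoretical significance and practical value.
The study first used an open-ended questionnaire to survey teachers' and students' views on what makes physics items easy or hard, and explored which factors they regard as influencing item difficulty. On the basis of the questionnaire survey and a literature review, it proposed 15 hypothesized factors influencing the difficulty of senior high school physics items: reading load, non-necessary-and-sufficient conditions, the number of knowledge points tested, the number of content modules involved, the complexity of the physical process, the complexity of the mathematical process, the openness of the question goal, the physical modeling difficulty of the context features, the probability of scoring by guessing, the degree of hinting, the novelty of the problem context, the mode of expression of the question, the status of the knowledge point in teaching, step-by-step questioning, and the familiarity of the background knowledge.
Drawing on the quantitative analysis and statistical testing methods used by researchers at the Educational Testing Service (ETS) in the United States to study item difficulty, the study established quantitative coding criteria for the 15 hypothesized factors according to the specific features of senior high school physics items, applied these criteria in a quantitative analysis with statistical tests, and examined the statistical trends in how the factors influence item difficulty. The main conclusions are as follows. Four factors have a statistically significant influence on the difficulty of senior high school physics items: the number of knowledge points tested, the complexity of the physical process, the physical modeling difficulty of the context features, and the complexity of the mathematical process. These factors are common in physics items. The factor with the greatest influence is the physical modeling difficulty of the context features, followed in turn by the complexity of the physical process, the complexity of the mathematical process, and the number of knowledge points tested. Their effects on difficulty follow these trends: the greater the physical modeling difficulty of the context features, the harder the item; the more physical inferences (or physical equations) an item involves, the harder it is; complex mathematical operations or reasoning tend to increase difficulty; and the more knowledge points an item tests, the harder it usually is.
Building on the quantitative statistical study, the study also drew on the comparative testing method used by researchers at the University of Cambridge Local Examinations Syndicate (UCLES) in the United Kingdom to study the validity of contextualized items, and carried out supplementary research on some of the factors in order to explore regularities that statistical methods could not reveal. The main conclusions are as follows.
On the context features of items: if an item presents candidates with a real, complex context that has not been processed or abstracted, the physical modeling difficulty increases, because a real context contains more irrelevant information and the physical model it embodies is more hidden. In most cases, familiarity with the context lowers the difficulty of physical modeling. On the other hand, with a familiar context candidates are prone to fixed habits of thought: they apply a familiar model directly, neglect careful analysis of the new context, and end up answering incorrectly. Hence, for items whose surface features resemble contexts students know well but whose essential features have quietly changed, the more familiar the surface features, the more they mislead physical modeling and the harder the item. Items testing the same knowledge point may differ greatly in difficulty because of how the context is set.
On the mode of expression: habits of wording and choice of words can both affect item difficulty. An auxiliary diagram may have a positive effect, a negative effect, or no effect on candidates' problem solving. For problems with complex contexts and abstract spatio-temporal relations, the presence or absence of an auxiliary diagram has a significant effect on candidates of medium ability; for items with clear contexts and simple spatio-temporal relations, it has no significant effect; and mishandled details in a diagram may mislead candidates into choosing a wrong solution strategy and so affect difficulty. Provided the aim is to express the problem accurately, concisely, and clearly, item writers may choose textual, graphical or tabular, or combined text-and-figure expression as needed; in that case, different modes of expression have no significant effect on item difficulty.
In some items the given conditions are more or fewer than those needed for the solution; that is, the given conditions are not necessary and sufficient. The comparative tests in this study show that such conditions tend to increase the difficulty of physics items. When an item contains much redundant information, candidates must screen out the redundant information and extract what is useful, which clearly raises the demand on their information-extraction ability and so increases difficulty. When the information given is insufficient, that is, when the item does not explicitly provide all the information needed for the solution, candidates must supply the necessary information themselves, and this information may be buried among the large amount of information stored in their minds. In essence this too raises the demand on information-extraction ability and so increases difficulty.
Some items use open-ended questions, so that the correct answer is not unique. The comparative tests in this study show that open-ended questioning tends to increase the difficulty of physics items.
Regarding how questions are posed, setting stepped (ladder-like) questions and asking in separate parts are two distinct concepts. If the sub-questions are unrelated to one another and are in essence merely a collection of small questions, they do not differ in essence from a single question and have no statistically significant effect on difficulty. If the sub-questions are interrelated, so that solving earlier ones builds a ladder toward solving later ones, that is, if the sub-questions are progressive and stepped, they can effectively help candidates clarify their line of reasoning and so lower item difficulty. The comparative tests in this study show that setting stepped questions tends to lower the difficulty of physics items.
Finally, experts and teachers were surveyed on their agreement with the study's conclusions; the results show that the great majority agreed. The four statistically significant factors were then used to code and statistically analyze items. The results show that the difficulty rank order determined by these four factors has highly significant Spearman rank correlations with both teachers' estimated difficulty and the measured difficulty of the items, indicating that the four factors and their coding criteria are of very high reference value for predicting item difficulty.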

【Abstract】 As a very important means of educational assessment, the examination has undeniably persisted and continued to develop, despite rises and falls over its history. In China, the number of candidates sitting the national matriculation examination is now close to 10,000,000 a year. It is obviously very important to improve the quality of such a large-scale examination, which involves thousands of families and many aspects of society.
Difficulty is one of the most important criteria for evaluating test quality. In setting a paper, an accurate grasp of the difficulty of the test questions is a guarantee of test quality. Nevertheless, the estimation and adjustment of item difficulty are often unsatisfactory. The main reason is that the factors affecting difficulty are not sufficiently understood and in-depth research in this field is lacking. For instance, there has been no systematic study of the factors affecting the difficulty of physics test items. A study of these factors, from theoretical analysis to empirical demonstration, is therefore of great significance for test paper design and classroom teaching.
By means of a questionnaire, this research investigates students' and teachers' understanding of the difficulty of physics test items and probes the factors affecting item difficulty.
On the basis of the questionnaire data and previous research results, we put forward 15 hypothesized factors affecting the difficulty of physics test items in senior high school: the number of characters to be read, insufficient or unnecessary given conditions, the number of knowledge points involved, the number of modules, the complexity of the physics process, the complexity of the mathematics process, the openness of the question target, the physics modeling difficulty of the context features, the probability of scoring by guessing, the degree of hinting, the novelty of the context, the way the question is expressed, the status of the knowledge involved in daily teaching, the steps of questioning, and the familiarity of the background information.
Drawing on the statistical method used by researchers at the Educational Testing Service (ETS) to investigate item difficulty, and taking into account the specific features of physics test items in senior high school, this research establishes scoring criteria for the individual factors. Based on these criteria, it provides a statistical analysis of the 15 hypothesized factors and discusses statistically how they tend to affect item difficulty. The main findings are as follows.
Statistically, four factors significantly affect the difficulty of physics test items: the number of knowledge points involved, the complexity of the physics process, the physics modeling difficulty of the context features, and the complexity of the mathematics process. These factors are commonly present in physics test items. The one that affects difficulty most is the physics modeling difficulty of the context features, followed in turn by the complexity of the physics process, the complexity of the mathematics process, and the number of knowledge points involved.
The tendencies are as follows: the more difficult the physics modeling of the context features, the more difficult the question; the more steps of physics reasoning (or physics equations) involved, the more difficult the question; a question that requires complicated mathematical operations or reasoning tends to be more difficult; and the more knowledge points a question involves, the more difficult it usually is.
Building on the statistical analysis of the 15 hypothesized factors, this research also draws on the parallel testing method used by researchers at UCLES to investigate the validity of contextualized test questions, in order to carry out a supplementary study of some of the factors. The main results are as follows.
On the features of the question context: if examinees are presented with a real context that has not been processed or abstracted, which contains more irrelevant information and a more deeply hidden model, the physics modeling becomes more difficult. In most cases, familiarity with the context reduces the difficulty of physics modeling. On the other hand, precisely because the context is familiar, examinees are inclined to fall back on habitual thinking and answer with the models they know, neglecting to analyze the new context and arriving at a wrong answer. Hence, for questions whose surface features resemble a familiar context but whose essential features have changed, the more familiar the surface features, the more they mislead the physics modeling and the more difficult the question is. Questions testing the same knowledge point may vary greatly in difficulty because of their different contexts.
On the way questions are expressed: habits of wording and choice of diction can both affect the difficulty of test questions. Graphics may help, hinder, or make no difference to examinees' answering process.
For questions with a very complicated context and comparatively abstract space-time relations, the inclusion of graphics markedly helps examinees of medium ability to answer. If the context and space-time relations are very simple, graphics make no significant difference. If the details of the graphics are handled carelessly, they may mislead examinees and increase the difficulty. Provided the aim is an accurate, brief, and clear expression of the question, the designer can choose text, graphics, or a combination of the two to aid examinees' understanding; in that case, the different modes of expression make no marked difference to the difficulty of test questions.
In some test questions, more or fewer conditions are given than are needed. This research found that in such cases the difficulty of the question tends to increase. When a question contains a lot of redundant information, examinees need to exclude the redundant and extract the useful. This obviously raises the demand on examinees' ability to extract information and increases the question's difficulty. When the information given is not enough, i.e. the designer has not provided all the information the question requires, examinees must supply the necessary information themselves. Such information, however, may be submerged in the large quantity of information stored in their minds. Consequently this too essentially raises the demand on that ability and increases the difficulty.
Some questions are open-ended, so the correct answer is not unique. The results of this research indicate that open-ended questions tend to increase difficulty.
In presenting the target of a question, designing ladders for answering is different from asking in steps.
If the sub-questions in a stepped question are not related to each other, there is no essential difference between it and a one-step question, and statistically the steps do not affect the question's difficulty. By contrast, if the sub-questions are related to each other and the answer to one sub-question gives a clue to the next, i.e. the sub-questions are progressive or ladder-like, they can help examinees understand the question and reduce the difficulty. The parallel testing results show that designing ladder-like sub-questions tends to reduce difficulty.
Finally, a survey of agreement was carried out among experts and teachers. The results indicate that most of the experts and teachers agree with the conclusions drawn from this research. When the four significant factors were used to analyze examination papers, the difficulty ranking of the questions determined by these four factors correlated significantly with both the teachers' predicted difficulty and the measured difficulty of the test items. This indicates that the four factors and their scoring criteria proposed by this research are of significant value for predicting the difficulty of test items.
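The rank-correlation check mentioned at the end of the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the scoring criteria or say how the four factor codes are combined, so the sketch simply sums them; the item codes and score rates are made-up values, not data from the study; and the function names are hypothetical. Measured difficulty is taken here as one minus the score rate, so that larger values mean harder items.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical items: (label, codes for the four significant factors, score rate).
# Code order: knowledge points, physics process, modeling difficulty, math process.
items = [
    ("Q1", (1, 1, 1, 1), 0.85),
    ("Q2", (2, 1, 2, 1), 0.70),
    ("Q3", (2, 2, 2, 2), 0.45),
    ("Q4", (3, 3, 2, 2), 0.50),
    ("Q5", (3, 3, 3, 3), 0.25),
]

# Assumption: combine the four codes by a plain sum.
factor_score = [sum(codes) for _, codes, _ in items]
# Assumption: measured difficulty = 1 - score rate (higher means harder).
measured_difficulty = [1 - rate for _, _, rate in items]

rho = spearman_rho(factor_score, measured_difficulty)
print(round(rho, 3))
```

With these made-up values the coefficient comes out at 0.9, because only Q3 and Q4 swap places between the two rankings. A real analysis would also test the coefficient for significance and would use the study's own coding criteria rather than a plain sum.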

  • 【Online publication contributor】 Southwest University (西南大学)
  • 【Online publication year and issue】 2008, Issue 09