

Evaluation of Scientific Research: Methodology and Application Approach

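【Abstract, translated from the Chinese】 Since World War II, science and technology have developed rapidly and have become a major force for improving human life and advancing society. Scientific research has become a focal area of competition among the world's nations and a decisive, key factor in comprehensive national strength. Against this background, scientific research is no longer merely the individual pursuit of scientists driven by "curiosity"; it is an important component of governments' strategic development planning. As government investment in scientific research has grown sharply, research evaluation is no longer an issue of concern to the scientific system alone but one shared by government and society. A sound and effective research evaluation mechanism is of great significance for optimizing the allocation of research resources, stimulating the innovative potential of researchers, fostering an environment for scientific and technological innovation, and advancing the establishment and development of the national innovation system. Evaluation in scientific research depends on a sound and well-regulated evaluation environment and mechanism, and even more on evaluation methods suited to the characteristics of scientific research. Building on a review of research and practice in research evaluation in China and abroad, this dissertation studies research evaluation methods systematically. It stresses the combination of theory and practice, concentrates on evaluation methods of high practical value, and carries out empirical studies with real data.

Starting from the information base of research evaluation, the dissertation divides evaluation methods into three classes: subjective evaluation methods based on expert knowledge, objective evaluation methods based on statistical data, and comprehensive evaluation methods based on system models. Following this classification, it studies peer review, the Delphi method, bibliometric methods, the Analytic Hierarchy Process (AHP) and the comprehensive evaluation method in turn. These methods are singled out because peer review and bibliometrics are the characteristic methods of research evaluation, while the Delphi method, AHP and the comprehensive evaluation method also have high practical value in research evaluation.

The dissertation has seven chapters, which fall into four parts.

The first part comprises the introduction (Chapter 1) and an overview of research evaluation (Chapter 2). The introduction sets out the theoretical value and practical significance of studying research evaluation methods, systematically analyzes the current state and problems of research on and application of these methods in China and abroad, points out the gap between theoretical research and practical application, and analyzes why the gap exists. Taking this as its starting point, it states the guiding ideas of the study and lays out its content. Chapter 2 describes the stages in the development of research evaluation, analyzes its main models, proposes principles that research evaluation should follow, and discusses in general terms the methods for collecting and checking evaluation data and for testing the reliability and validity of evaluation results.

The second part studies the subjective evaluation methods based on expert knowledge: peer review (Chapter 3) and the Delphi method (Chapter 4). For peer review, drawing on practice in China and abroad, the dissertation discusses the main forms in which peer review is carried out and their principal advantages and disadvantages, analyzes the main principles for selecting peer experts, examines norms and constraints in the peer review process, and looks ahead to the development of peer review in a networked environment. For the Delphi method, after introducing its characteristics and implementation steps, the dissertation discusses questionnaire design and data processing, examines variants of the method, and compares the advantages, disadvantages and scope of application of the Delphi method and ordinary expert surveys. Finally, the author presents a worked example of applying the Delphi method in detail and comments on it.

The third part focuses on an objective evaluation method based on statistical data: the bibliometric method (Chapter 5). The author first introduces the bibliometric indicators commonly used in research evaluation practice and their data sources, then studies the laws of literature distribution, scientific productivity, and citation analysis methods and indicators, and discusses the trend from bibliometrics toward informetrics. The author then describes the main applications of bibliometric methods in research evaluation and raises several points that deserve attention: (1) bibliometric methods are better suited to macro-level and meso-level evaluation; (2) differences in citation behavior between disciplines and the effect of the size of the evaluated object on the results should be taken into account; (3) the objectivity and representativeness of the data sources must be ensured; (4) misuse of bibliometric indicators and one-sided emphasis on particular indicators should be guarded against. Finally, the author carries out two empirical studies. The first proposes the concepts of the subject self-citing rate and the subject self-cited rate and applies them to evaluating the development of disciplines, with good results. The second fits the distributions of self-reported and source-derived research performance indicators for 627 Chinese universities. Its results confirm the conclusion of earlier similar studies in China that universities' self-reported research performance indicators are of poor objectivity. It also points out flaws and shortcomings in the "sort order-frequency" distribution fitting used in related Chinese studies, proposes a more scientific way of fitting the distribution of research indicators, "rank-frequency" distribution fitting, and demonstrates the soundness of this fitting method through empirical analysis.

The fourth part studies the comprehensive evaluation methods based on system models: AHP (Chapter 6) and the comprehensive evaluation method (Chapter 7). AHP is in fact itself a comprehensive evaluation method, but its distinctive modeling approach and high practical value led the author to give it a chapter of its own. Chapter 6 describes the basic principles and procedure of AHP in detail, discusses group decision-making with AHP, proposes a simple method for weighting experts in group decisions, and verifies it in an empirical study. In that study the author uses AHP to build an indicator system for university research strength, confirming the effectiveness of AHP in research evaluation practice. For comprehensive evaluation in general, Chapter 7 gives a fairly full treatment under four headings: building the indicator system, determining indicator weights, determining the values of the basic indicators, and the model for aggregating evaluation data. The author concentrates on methods for determining indicator weights, compares the different methods and points out the defects and shortcomings of each. Finally, the author carries out two empirical studies. The first examines the feasibility of using principal component analysis for comprehensive evaluation and, after rigorous analysis, advances the new view that principal component analysis is not suitable for composite indicator evaluation, because a judgment about correlation in the data cannot stand in for a value judgment about the indicators. The second ranks the research strength of 627 universities by the composite indicator evaluation method, stressing throughout the idea of giving equal weight to scale and efficiency.

This dissertation is one of the outputs of the Wuhan University teaching reform research project "Research on the Evaluation and Adjustment of University Disciplines and Programs" (武大教字[2003]193) and the Wuhan University social science research project "Research on the Evaluation of the Social Science Competitiveness of Chinese Universities" (武大科文字[2003]31号).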

【Abstract】 Since World War II, science and technology have advanced rapidly and have become a major force for improving human life and promoting social progress. Scientific research is now a focal field of competition among the world's countries and a decisive factor in national power. In this context, scientific research is no longer only the work of individuals; it is also a significant part of national strategic planning. As government investment in scientific research keeps increasing, research evaluation is no longer a concern of the scientific community alone but also an issue of concern to government and society. A sound and valid research evaluation system matters greatly for optimizing the allocation of research resources, motivating researchers, creating an environment conducive to scientific innovation, and building and developing the national innovation system. Evaluation activities in scientific research depend on a sound and well-regulated evaluation environment and, moreover, on evaluation methods suited to the characteristics of scientific research. Based on a review of domestic and overseas studies and practice in research evaluation, this dissertation studies research evaluation methods systematically. The study aims to combine theory with practice, highlights the evaluation methods of greatest practical value, and conducts application studies with real data.

According to the information base of evaluation, there are three kinds of research evaluation methods: subjective evaluation methods based on expertise, objective evaluation methods based on statistical data, and systemic evaluation methods based on synthetic models.
Among these three kinds, the peer review method, the Delphi method, the bibliometric method, the AHP method and the synthetic evaluation method are highlighted in this dissertation. Because peer review and bibliometrics are the characteristic methods of research evaluation, they receive the most emphasis. The Delphi method, the AHP method and the synthetic evaluation method are also widely used in research evaluation practice, so they are included in this study as well.

This dissertation consists of seven chapters, which fall into four parts.

The first part includes the foreword (Chapter 1) and the introduction to research evaluation (Chapter 2). In the foreword, the author identifies the academic value and practical significance of studying research evaluation methods, then reviews and analyzes the current state and problems of domestic and foreign studies on these methods. The author identifies the gaps between academic study and practical application of research evaluation methods and the reasons these gaps arose. Based on these analyses, the guidelines and main contents of the dissertation are set out. Chapter 2 first reviews the three successive phases in the development of research evaluation, then discusses its main patterns and important principles. It goes on to describe the methods of data collection and data verification in research evaluation, along with methods for testing reliability and validity.

The second part studies the subjective evaluation methods based on expertise, in particular the peer review method (Chapter 3) and the Delphi method (Chapter 4). After reviewing the practice of peer review, the author discusses the forms in which it is implemented, such as Mail-only, Panel-only and Mail + Panel.
The merits and defects of these forms are compared. The author then discusses the fundamental principles for selecting peers, explores the norms and constraints of peer review, and describes the development of peer review in the Internet environment. For the Delphi method, the dissertation first introduces its characteristics and application process, then discusses questionnaire design and data analysis. The author next compares the advantages, disadvantages and areas of application of the Delphi method and general expert survey methods. At the end of this part, the author presents and comments on a worked example of the Delphi method.

The third part concerns the objective evaluation methods based on statistical data, with strong emphasis on the bibliometric method (Chapter 5). First, the bibliometric indicators often used in research evaluation are studied and their data sources introduced. The author then studies the methods commonly used in research on publication distribution, scientific productivity and citation analysis, and discusses the trend from bibliometrics toward informetrics. Next, the main applications of the bibliometric method in research evaluation are studied. The author then raises four issues that deserve attention: first, the bibliometric method is well suited to macro-level and meso-level evaluation but less suited to micro-level evaluation; second, differences in citation behavior between subjects, and the influence of the scale of the evaluated object, should be taken into account; third, the reliability and validity of the data sources should be ensured; fourth, the misuse of bibliometric indicators and one-sided emphasis on particular indicators should be guarded against.
In the last part of the chapter, two application studies are conducted. In the first, the author advances the concepts of the subject self-citing rate and the subject self-cited rate and applies them to subject evaluation. The results, based on real data, show that the method is valid. The second application study examines the distribution of two kinds of research indicators for Chinese universities: self-reported indicators and source-derived indicators. It shows that self-reported indicators are less reliable and less authentic than source-derived indicators. This result is consistent with the earlier, similar study conducted by Professor Liang Liming (2000). However, the study also found flaws in the sort order-frequency distribution fitting method used by Professor Liang Liming, and the author advances a new rank-frequency distribution fitting method. The new method is applied to data from 627 universities, and the results support its appropriateness and validity.

The fourth part studies the systemic evaluation methods based on synthetic models, such as the Analytic Hierarchy Process (AHP) method (Chapter 6) and the synthetic evaluation method (Chapter 7). AHP is in fact a kind of synthetic evaluation method, but its distinctive modeling approach and wide application make it worth a chapter of its own. Chapter 6 first introduces the fundamental principle and application process of AHP. The author then discusses group decision-making with AHP and advances a concise group decision model. Subsequently, an AHP application study is conducted to establish a university research evaluation system, and the group decision model is confirmed in that study.
In the last chapter, the synthetic evaluation method is studied in four respects: establishing the indicator system, assigning weights to indicators, measuring the basic indicators, and synthesizing the data of different indicators. The author highlights the weighting methods and identifies the defects of each. Finally, two application studies are conducted. The first studies the feasibility of Principal Component Analysis (PCA) in research evaluation and finds that PCA is not suitable for it, because an estimate of correlation in the data cannot substitute for a value judgment about the indicators. In the second, a research evaluation of Chinese universities is conducted, with equal attention paid to research efficiency and research scale.

