

Research on Quality Control of the Data in Web Survey

【Author】 Fan Mingyue (樊茗玥)

【Supervisor】 Zhao Xicang (赵喜仓)

【Author information】 Jiangsu University, Management Science and Engineering, 2011, PhD

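【Summary】 Web surveys are the product of combining modern network technology with traditional survey techniques. With the rapid development of the Internet and its ever wider adoption, web surveys are being conducted more and more widely. Compared with traditional surveys, web surveys have clear advantages in organization and implementation, information collection, information processing, and survey outcomes. However, the very characteristics of the network also give web surveys distinctive weaknesses, such as limited coverage, low controllability, openness, and security concerns. These problems are obstacles to controlling the quality of web survey data. This study examines web survey data quality in depth, starting from the mechanisms that generate data errors. Building on total quality management theory, optimal control theory, and data error theory, it defines web survey data quality and related concepts; proposes a theory of web survey data quality control; constructs an indicator system of the factors influencing web survey data quality, with the factors influencing endogenous quality, transmission quality, and control quality as second-level indicators; and analyzes those factors. The analysis shows that data error is the main factor influencing web survey data quality. After a comprehensive analysis of the effects of web survey data errors, the study introduces and improves several internationally advanced error correction techniques: weighting adjustment, two-stage sampling, hot-deck imputation, and randomized response models. At the micro level of data quality control it studies the data errors caused by network characteristics, and at the methodological level it seeks effective ways of controlling web survey data quality. On the basis of the theoretical analysis and under strong assumptions, it constructs simulation sample sets, designs simulation procedures for correcting nonresponse error and measurement error in web surveys, and designs and runs simulation programs in the statistical software S-Plus and SPSS to verify the feasibility and effectiveness of each error correction technique. The study concludes that effective ways to control web survey data quality include promoting the full participation of survey organizations, raising the overall competence of Internet users, designing sound survey plans, increasing the trustworthiness of web surveys, strengthening process monitoring, applying the necessary data correction techniques, incorporating rich prior auxiliary information, identifying the scope for which web surveys are suitable and supplementing them with mixed-mode data collection, and adopting interdisciplinary techniques and methods. The main innovations of this study are as follows. First, it rigorously defines web survey data quality and web survey data quality control, proposes a theory of web survey data quality control, and constructs an indicator system of influencing factors, broadening the field of research on web survey data quality. Second, it systematically studies error correction techniques for controlling web survey data quality. Based on a comprehensive analysis of error effects, it brings in results from the field of data error research, adapts them to the characteristics of web surveys, and corrects coverage error, sampling error, nonresponse error, and measurement error in turn. In this way it controls, at the technical level, the data quality problems caused by error factors in web survey work, including problems with the "people" taking part in the survey, problems with the design of the survey itself, and problems with data collection, and it proposes workable methods for safeguarding data quality. Third, it verifies the feasibility of error correction in web surveys. Using suitable simulation sample sets of web survey data errors, it designs the simulation steps for correcting nonresponse error and measurement error and designs and runs the simulation programs in S-Plus and SPSS. This realizes techniques for controlling web survey data quality from the perspective of error, opens up new areas of application for existing theories and methods, and improves the depth and precision of the research.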
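The weighting adjustment named in the abstract can be illustrated with a minimal post-stratification sketch. This is not the thesis's own procedure, which the abstract does not spell out; the function name `poststratified_mean`, the strata, and the toy data are all hypothetical. The idea shown is only that respondents are grouped into strata and each stratum mean is weighted by its known share of the target population, which can offset coverage error when some groups are over-represented online.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Estimate a population mean by reweighting strata to known shares.

    sample: list of (stratum, value) pairs from web respondents.
    population_shares: dict mapping stratum -> share of the target
        population (assumed known from auxiliary data, summing to 1).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in sample:
        totals[stratum] += value
        counts[stratum] += 1
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in strata: {sorted(missing)}")
    # Weighted sum of stratum means, using population (not sample) shares.
    return sum(
        share * totals[s] / counts[s] for s, share in population_shares.items()
    )


# Hypothetical toy data: younger users are over-represented among respondents.
sample = [("young", 1.0)] * 8 + [("old", 0.0)] * 2
shares = {"young": 0.5, "old": 0.5}

naive = sum(value for _, value in sample) / len(sample)
adjusted = poststratified_mean(sample, shares)
print(naive, adjusted)
```

In this toy case the unweighted mean follows the over-represented group, while the adjusted estimate gives each stratum its population share. The sketch assumes the population shares are known and that every stratum has at least one respondent.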

【Abstract】 Web surveys combine modern network technology with traditional survey techniques. With the rapid development of the Internet and its growing adoption, this kind of survey is applied more and more widely. Compared with traditional surveys, web surveys have many advantages in survey organization, information collection, information processing, and survey results. However, because of the characteristics of the network, web surveys also have disadvantages, such as low coverage, low controllability, high openness, and low security. These problems are obstacles to controlling data quality in web surveys. In this dissertation, we start from data error to study the data quality of web surveys. Based on the theories of Total Quality Management, Optimal Control, and Statistical Error, we define several concepts related to data quality in web surveys. Using this definition of data quality, we also construct an index system of the factors that influence web survey data quality, namely the endogenous quality, transmission quality, and control quality of web survey data. After analyzing the error effects in web survey data, we apply several error control techniques to the error problems in web surveys and look for effective ways to control sampling error, coverage error, nonresponse error, and measurement error. The research finds that error adjustment methods can mitigate the problems caused by participants who may refuse to respond and by the design of the survey itself. These methods can likewise help reduce problems of data collection and administration, legal supervision, and online ethics in web surveys. Based on the theoretical analysis and under strong assumptions, we use data from the Leibniz Institute for the Social Sciences, Germany, as the simulation sample sets. The survey, named Social Inequality, was completed in July 2011.
We then design the simulation process for nonresponse and measurement error correction, and use the statistical software S-Plus and SPSS to design and implement the simulation program and test the effectiveness of the methods studied above. Finally, we offer recommendations for controlling data quality in web surveys: keeping the survey organization involved throughout the survey process, raising the overall competence of web users, designing the survey scheme scientifically, enhancing the credibility of web surveys, strengthening process control, supplying sufficient prior auxiliary information, clarifying the survey coverage, using mixed survey modes to carry out the survey, and applying interdisciplinary techniques and methods to web survey data quality. The innovations of this dissertation are as follows. First, it gives rigorous definitions of web survey data quality and of web survey data quality control, constructs an index system of the factors affecting web survey data quality, and uses this index system to study which factors have the most influence. Second, it studies useful methods for controlling data quality in web surveys. These methods draw on research results from statistics and also incorporate the characteristics of the network, so that data errors caused by "people", questionnaire design, and data collection can be reduced. Finally, it uses simulation to check the effectiveness of correcting data errors in web surveys. It identifies suitable data sets, designs the correction processes for nonresponse error and measurement error, and designs and implements the simulation program in S-Plus and SPSS to draw conclusions. This extends the theory and the application of existing methods and advances the study of data quality in web surveys.
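The simulation of nonresponse and measurement error correction described above might look, in heavily simplified form, like the sketch below. It does not reproduce the thesis's S-Plus or SPSS programs or its data set. It assumes two standard techniques named in the Chinese abstract: hot-deck imputation within classes for nonresponse, and Warner's randomized response estimator for measurement error on sensitive questions. All function names, class labels, response rates, and values are hypothetical.

```python
import random


def hot_deck_impute(records, rng):
    """Fill missing 'y' values with a random donor value from the same class.

    records: list of dicts with keys 'cls' (imputation class) and 'y'
        (observed value, or None for a nonrespondent).
    Returns a new list; the input records are not modified.
    """
    donors = {}
    for r in records:
        if r["y"] is not None:
            donors.setdefault(r["cls"], []).append(r["y"])
    completed = []
    for r in records:
        y = r["y"]
        if y is None:
            if r["cls"] not in donors:
                raise ValueError(f"no donor in class {r['cls']!r}")
            y = rng.choice(donors[r["cls"]])
        completed.append({"cls": r["cls"], "y": y})
    return completed


def warner_estimate(yes_share, p):
    """Warner's randomized response estimate of a sensitive proportion pi.

    Each respondent answers the sensitive statement with probability p and
    its negation with probability 1 - p, so
    P(yes) = p * pi + (1 - p) * (1 - pi).
    """
    if abs(2 * p - 1) < 1e-12:
        raise ValueError("p must differ from 0.5")
    return (yes_share - (1 - p)) / (2 * p - 1)


# Hypothetical simulation: class "b" has higher values and responds less often,
# so the respondent-only mean is biased downward (true mean is about 15).
rng = random.Random(0)
records = []
for i in range(400):
    cls = "a" if i % 2 == 0 else "b"
    y = (10.0 if cls == "a" else 20.0) + rng.gauss(0.0, 1.0)
    responds = rng.random() < (0.9 if cls == "a" else 0.4)
    records.append({"cls": cls, "y": y if responds else None})

observed = [r["y"] for r in records if r["y"] is not None]
respondent_mean = sum(observed) / len(observed)
completed = hot_deck_impute(records, rng)
imputed_mean = sum(r["y"] for r in completed) / len(completed)
print(respondent_mean, imputed_mean)
```

The hot-deck step only removes bias to the extent that nonresponse is unrelated to the outcome within each imputation class, which is one form the "strong assumptions" mentioned in the abstract could take. Warner's estimator likewise assumes respondents follow the randomizing device honestly.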

  • 【Online publication contributor】 Jiangsu University
  • 【Online publication year and issue】 2012, Issue 06