

Data Mining Application in E-government OA System

【Author】 袁峻

【Supervisor】 曾大聃

【Author information】 East China Normal University, Software Engineering, 2010, Master's

【Abstract】 E-government has developed rapidly in China in recent years. Government departments at all levels have built a large number of databases, and the volume of data is growing exponentially. How to use new data analysis techniques to extract useful information from e-government systems efficiently and accurately has become a question of practical significance. This thesis applies data mining techniques to iGRP, a municipal e-government office automation (OA) system, with the goal of discovering the attributes that affect user activity. The analysis proceeds in six steps. First, appropriate target attributes and predictor attributes are selected according to the goal of the analysis. Second, the selected attributes are extracted from the iGRP databases, integrated, and cleaned. Third, the numeric predictor attributes are treated for noise and discretized. Fourth, the "Attribute Importance" function of ODM (Oracle Data Mining) is applied to the target attributes and their related predictor attributes, and irrelevant predictor attributes are excluded to reduce the dimensionality of the data. Fifth, the O-Cluster algorithm of ODM is used to cluster the target attributes together with their related predictor attributes in order to find a suitable split point for each numeric target attribute; the target attribute is then converted into a binary attribute at that split point. Finally, the ODM decision tree algorithm is used to classify the target attributes, and the resulting model is tested and evaluated.

A total of 7,827 records, comprising 30 predictor attributes and 2 target attributes, were extracted from 5 databases of one city's iGRP e-government system. Mining this dataset with the method above leads to the following conclusions. The attribute with the greatest impact on user activity is "Total Favorites", followed by the roles "person responsible for sending official documents" and "person responsible for receiving official documents". Based on these conclusions, user requirements for and feedback on the "Favorite Folder" function module should be studied further so that the function can be improved and users can be served better. In addition, user training and user feedback surveys should pay more attention to users who hold the role "person responsible for sending official documents" or "person responsible for receiving official documents". By applying data mining techniques to real e-government system data, this thesis achieves efficient and accurate analysis of massive data and provides a basis for improving the iGRP product and increasing user satisfaction.
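The steps described in the abstract can be illustrated with a minimal sketch. It does not use ODM: information gain stands in for the ODM "Attribute Importance" function, the split point is supplied by hand rather than found by O-Cluster, and ranking attributes by gain corresponds only to choosing the root split of an ID3-style decision tree. All column names and values below are invented for illustration and are not taken from the iGRP dataset.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Discretize numeric values into equal-width bins numbered 0..n_bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def binarize(values, split_point):
    """Turn a numeric target into a binary one: 1 above the split point, else 0."""
    return [1 if v > split_point else 0 for v in values]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after partitioning by a discrete feature."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(table, target):
    """Return (attribute name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(col, target) for name, col in table.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy data: a numeric activity measure and two predictor attributes.
activity = [1, 2, 3, 40, 50, 60]
target = binarize(activity, split_point=20)  # split point chosen by hand here

table = {
    # numeric predictor, discretized into two bins before ranking
    "hypothetical_count": equal_width_bins([0, 1, 2, 10, 11, 12], n_bins=2),
    # categorical predictor that is nearly unrelated to the target
    "hypothetical_flag": [0, 1, 0, 1, 0, 1],
}

ranking = rank_attributes(table, target)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

Attributes with a gain near zero would be the candidates for exclusion in the dimensionality-reduction step, and the top-ranked attribute is the one a gain-based decision tree would split on first.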

【Key words】 E-government; Data Mining; Decision Tree; Clustering
  • 【Classification number】 TP311.13
  • 【Times cited】 2
  • 【Downloads】 259