基于BP神经网络的属性选择研究

Research on Feature Selection Based on BP Neural Networks

【Author】 顿煜卿

【Advisor】 陈利

【Author information】 Central China Normal University, Computer Application Technology, 2009, Master's degree

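【Abstract (translated from the Chinese)】 Data mining is a technology for extracting useful information from large volumes of data, and data preprocessing is an important stage of any data mining task; it matters most when mining massive, high-dimensional data. The data used for analysis may contain hundreds of attributes, many of them irrelevant to the mining task, so using attribute selection to find the smallest attribute set is especially important for improving mining efficiency. Classification-oriented data mining has many tools, one of which is the neural network, with the BP neural network the most commonly used. Existing neural-network attribute selection methods nevertheless have shortcomings: a learning algorithm of this kind is not very efficient in itself, and if all attributes of the dataset are used to train and prune the network, the network becomes too large, it receives too much training information, and its learning efficiency is low. Overcoming these defects requires a new method that improves on the existing ones. This thesis proposes an improved neural-network attribute selection method that combines the strengths of the Wrapper and Filter models of attribute selection; it remedies the shortcomings of BP neural network attribute selection, speeds up BP network prediction, and raises the network's classification and prediction accuracy. The method first ranks the attributes of the initial attribute set by sensitivity analysis. Then, following that ranking, it removes secondary attributes one at a time and compares the BP network's prediction and classification accuracy after each removal. Finally, by comparing accuracy across these cases, it finds the smallest optimal attribute set. Simulation experiments in MATLAB compare the network's classification accuracy and efficiency before and after attribute selection, and the results show that the method performs well.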

【Abstract】 Data mining is a technology for extracting useful information from large-scale datasets, and data preprocessing is an important step in the data mining process, especially when mining high-dimensional data. Data used for analysis may contain hundreds of features, many of which are irrelevant to the mining task, so finding a minimal feature set is particularly important for improving mining efficiency. There are many tools for classification-oriented data mining. One of them is the neural network, of which the BP (backpropagation) neural network is the most commonly used. Existing neural-network feature selection methods have shortcomings, however: learning algorithms of this kind are not very efficient to begin with, and training a network on all features of a dataset makes the network very large, feeds it too much information, and lowers its learning and prediction efficiency. Overcoming these deficiencies requires a new approach that improves on the existing methods. This thesis presents an improved neural-network feature selection method that combines the advantages of the Wrapper model and the Filter model. The method mitigates the shortcomings of BP neural network feature selection, speeds up BP network prediction, and raises the network's prediction accuracy. It first ranks the features of the initial feature set by sensitivity analysis. Following that ranking, it then removes the secondary features one at a time and compares the BP network's prediction and classification accuracy before and after each removal. Finally, by comparing the accuracy obtained in these cases, it identifies the minimal feature set. Simulation results obtained with MATLAB show that the approach is effective.
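The rank-then-eliminate procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation. It is written in Python, not the MATLAB used for the thesis's experiments. Sensitivity is assumed here to be the mean absolute partial derivative of the network output with respect to each input, since the abstract does not give the exact formula. The network size, learning rate, toy data, and all function names (`train_bp`, `sensitivities`, `select_features`, `make_toy_data`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the argument so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))


def forward(net, x):
    """Return hidden activations and the scalar output for one sample."""
    W1, b1, W2, b2 = net
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    o = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, o


def train_bp(X, y, hidden=4, epochs=200, lr=0.5, seed=0):
    """Train a one-hidden-layer sigmoid network by stochastic backpropagation."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, o = forward((W1, b1, W2, b2), x)
            d_o = (o - t) * o * (1.0 - o)  # squared-error gradient at the output
            for j in range(hidden):
                d_h = d_o * W2[j] * h[j] * (1.0 - h[j])  # uses W2[j] before its update
                W2[j] -= lr * d_o * h[j]
                b1[j] -= lr * d_h
                for i in range(n_in):
                    W1[j][i] -= lr * d_h * x[i]
            b2 -= lr * d_o
    return W1, b1, W2, b2


def accuracy(net, X, y):
    hits = sum((forward(net, x)[1] >= 0.5) == (t == 1) for x, t in zip(X, y))
    return hits / len(X)


def sensitivities(net, X):
    """Assumed measure: mean |d output / d input_i| over the samples."""
    W1, _, W2, _ = net
    n_in = len(X[0])
    total = [0.0] * n_in
    for x in X:
        h, o = forward(net, x)
        for i in range(n_in):
            grad = o * (1.0 - o) * sum(
                W2[j] * h[j] * (1.0 - h[j]) * W1[j][i] for j in range(len(W2))
            )
            total[i] += abs(grad)
    return [s / len(X) for s in total]


def project(X, cols):
    """Keep only the listed feature columns."""
    return [[row[c] for c in cols] for row in X]


def select_features(X_tr, y_tr, X_te, y_te, tol=0.0):
    """Rank features by sensitivity, then drop the lowest-ranked ones one at a time."""
    net = train_bp(X_tr, y_tr)
    sens = sensitivities(net, X_tr)
    ranking = sorted(range(len(sens)), key=lambda c: sens[c], reverse=True)
    base_acc = accuracy(net, X_te, y_te)
    best_cols = ranking[:]
    history = [(ranking[:], base_acc)]
    for k in range(len(ranking) - 1, 0, -1):
        cols = ranking[:k]  # the k most sensitive features
        net_k = train_bp(project(X_tr, cols), y_tr)
        acc = accuracy(net_k, project(X_te, cols), y_te)
        history.append((cols, acc))
        if acc >= base_acc - tol:  # smaller subset with no loss beyond tol
            best_cols = cols
    return ranking, best_cols, history


def make_toy_data(n, seed):
    """Hypothetical data: the label depends on features 0 and 1; 2 and 3 are noise."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 0 else 0 for row in X]
    return X, y


if __name__ == "__main__":
    X_tr, y_tr = make_toy_data(80, seed=1)
    X_te, y_te = make_toy_data(40, seed=2)
    ranking, best_cols, history = select_features(X_tr, y_tr, X_te, y_te)
    print("ranking:", ranking)
    print("selected:", best_cols)
    for cols, acc in history:
        print(len(cols), "features -> accuracy", round(acc, 3))
```

The ranking step plays the Filter role, because it orders features once from a single trained network. The retrain-and-compare loop plays the Wrapper role, because each candidate subset is judged by the accuracy of a network trained on it. The `tol` parameter is an assumed knob for how much accuracy loss is acceptable when preferring a smaller subset.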
