Hard Disk Failure Prediction Model Validation in Large Data Center Environment

(Original Chinese title: 硬盘故障预测模型在大型数据中心环境下的验证)

【Authors】 Jia Yuhan (贾宇晗); Li Jing (李静); Jia Runying (贾润莹); Li Zhongwei (李忠伟); Wang Gang (王刚); Liu Xiaoguang (刘晓光); Xiao Kang (肖康)

【Affiliations】 College of Computer and Control Engineering, Nankai University; Beijing Qihoo Technology Co. Ltd; College of Software, Nankai University

【Abstract】 With the growth of the Internet and the sharp increase in storage scale, data loss caused by frequent hard disk failures in large data centers has become a major problem that enterprises cannot ignore. Previous work has built a variety of hard disk failure prediction models on SMART (Self-Monitoring, Analysis and Reporting Technology) attributes, using methods from applied statistics and machine learning. Although these models perform well, they have shortcomings in data collection and processing. Working in a real, large-scale Internet data center, this paper extracts SMART attribute data and proposes an attribute selection method based on the neural network weight matrix, combined with three non-parametric statistical methods: the rank-sum test, the reverse arrangements test (RAT), and Z-score. The selected attributes are used to build hard disk failure prediction models with two machine learning methods, CART decision trees and BP neural networks. Experiments show that both models perform very well, making this a good practical application of machine learning algorithms in a real deployment. In addition, the experiments, together with their analysis and interpretation, yield some useful conclusions that lay a foundation for further research.
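As a rough illustration of the rank-sum screening step named in the abstract, the sketch below scores each attribute by how differently it ranks in failed versus healthy drives and keeps attributes whose statistic exceeds a cutoff. It is not the authors' implementation: the abstract gives no thresholds or attribute lists, so the function names, the attribute names, the sample values, and the 1.96 cutoff are all assumptions, and the normal approximation used here omits the tie correction to the variance.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_z(failed, healthy):
    """Wilcoxon rank-sum z statistic (normal approximation, no tie correction)."""
    n1, n2 = len(failed), len(healthy)
    ranks = average_ranks(list(failed) + list(healthy))
    w = sum(ranks[:n1])  # rank sum of the failed-drive sample
    mean = n1 * (n1 + n2 + 1) / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / std


def select_attributes(failed_rows, healthy_rows, names, threshold=1.96):
    """Keep attributes whose |z| reaches the (assumed) threshold."""
    selected = []
    for col, name in enumerate(names):
        z = rank_sum_z([row[col] for row in failed_rows],
                       [row[col] for row in healthy_rows])
        if abs(z) >= threshold:
            selected.append(name)
    return selected


# Hypothetical SMART-like samples: column 0 separates the groups, column 1 does not.
names = ["reallocated_sectors", "power_on_hours"]
failed_rows = [(10, 1), (11, 2), (12, 3), (13, 4), (14, 5)]
healthy_rows = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
chosen = select_attributes(failed_rows, healthy_rows, names)
```

In this toy data only the first attribute is retained. In the pipeline the abstract describes, the retained attributes would then feed the CART and BP neural network models.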

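【Funding】 National Natural Science Foundation of China (61373018; 11301288); Program for New Century Excellent Talents in University, Ministry of Education (NCET130301); Fundamental Research Funds for the Central Universities (65141021)
  • 【Source】 Journal of Computer Research and Development (计算机研究与发展), 2015, Issue S2
  • 【Classification code】 TP333.35
  • 【Downloads】 142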