基于机器学习的入侵检测技术研究

Research on Intrusion Detection Based on Machine Learning

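【Author】 Zhang Yirong (张义荣)

【Advisor】 Wang Guoyu (王国玉)

【Author information】 National University of Defense Technology, Information and Communication Engineering, 2005, Ph.D.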

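【Summary】 As one of the core technologies of dynamic security systems (P2DRR), intrusion detection plays a vital role in defense-in-depth network architectures: it is the key to turning static protection into dynamic protection and a powerful tool for enforcing security policy. As network attacks grow more complex, diverse, and automated, traditional intrusion detection systems (IDS) can no longer meet security requirements. To counter the increasingly frequent distributed, multi-target, multi-stage combined network attacks and hacker activity, and to improve detection efficiency, reduce the false negative rate, and shorten detection time in high-bandwidth, large-scale networks, it has become a consensus that advanced machine learning methods should be introduced into IDS. The main work of this dissertation is to apply several currently active machine learning strategies to intrusion detection. Approaching intrusion detection from different perspectives, it systematically studies how statistical learning theory, symbolic inductive learning theory, and genetic learning methods can be applied to the analysis of intrusion detection signals. Within the probably approximately correct (PAC) learning framework, it uses computational learning theory and statistical hypothesis testing to compare and evaluate intrusion detection methods based on different machine learning strategies.

In the research based on statistical learning theory, intrusion detection is treated as a pattern recognition problem: normal and abnormal system behavior are distinguished from observations such as network traffic features and host audit records. When the training samples form an unlabeled, imbalanced data set, attack detection is treated as an outlier detection or sample density estimation problem, and a one-class SVM algorithm on a hypersphere is used. When a labeled, imbalanced data set yields a high classification error rate for the minority class, a weighted SVM algorithm, the dual ν-SVM algorithm, is introduced for anomaly detection. Further, using the 1998 DARPA intrusion detection evaluation data, the binary SVM algorithm is extended to multiclass SVM algorithms, and experiments comparing their performance are carried out.

In applying symbolic inductive learning theory to intrusion detection, the basic idea is to treat intrusion detection as a problem of knowledge representation and rule extraction. Rough set theory, built on indiscernibility relations, provides a common theoretical foundation for this type of machine learning. The dissertation studies in detail a method for modeling the normal behavior of processes based on rough set knowledge representation and rule acquisition. On this basis, and drawing on statistical machine learning theory, it proposes a hybrid anomaly detection algorithm that combines rough set reduction with support vector machine classification: rough set attribute reduction compresses the data space, and the binary ν-SVM algorithm then processes the reduced and normalized data. The algorithm shortens detection time without loss of detection accuracy and is therefore better suited to real-time intrusion detection.

In the research based on genetic learning, machine learning is viewed as a search process; that is, intrusion detection can be regarded as searching for, or approximating, intrusion rules over a training sample set according to a given search strategy. Building on an in-depth analysis of technical issues in implementing genetic algorithms (GA), such as key parameter selection, operator design, and algorithm improvements, the dissertation studies a method for automatically acquiring intrusion rules based on a niche genetic algorithm and presents the corresponding anomaly detection simulation results. Then, drawing on symbolic inductive learning theory, it proposes a hybrid detection method that uses rough set reduction and genetic rule extraction. The method takes the decision rule set obtained from rough set reduction as the initial population of the GA, which reduces the number of generations needed and improves detection accuracy.

On the basis of the above research, the dissertation compares and evaluates intrusion detection techniques based on different machine learning methods. Within the PAC learning framework, it analyzes the sample complexity and computational complexity of the learning algorithms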
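
The record does not say which sample-complexity bounds the dissertation derives. Purely as an illustration of what a PAC sample-complexity statement looks like, the sketch below evaluates the standard textbook bound for a consistent learner over a finite hypothesis class H in the realizable case: m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice to reach error at most ε with probability at least 1 − δ. The function name `pac_sample_bound` and the example numbers are hypothetical and are not taken from the dissertation.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Number of examples sufficient for a consistent learner over a finite
    hypothesis class (realizable case): m >= (ln|H| + ln(1/delta)) / epsilon.

    This is the textbook bound, used here only as an illustration; it is an
    assumption that a bound of this kind is relevant to the dissertation.
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    needed = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(needed)


# Hypothetical example: 1000 candidate rules, 10% error, 95% confidence.
print(pac_sample_bound(1000, 0.1, 0.05))  # prints 100
```

The bound grows only logarithmically with the number of hypotheses but linearly with 1/ε, which is why tightening the error tolerance is usually the more expensive requirement.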

【Abstract】 Intrusion detection, one of the core technologies of dynamic security systems (P2DRR), plays a very important role in the defense-in-depth architecture of networks. It is the key to converting static defense into dynamic defense, as well as a powerful tool for enforcing security policy. With the increasing sophistication, diversification, and automation of network attacks, traditional intrusion detection systems (IDS) can no longer meet security needs. To withstand increasingly frequent distributed, multi-target, multi-stage compound network attacks and hacker activity, and to improve intrusion detection efficiency, decrease the false negative rate, and shorten detection time in high-bandwidth, large-scale networks, incorporating advanced machine learning techniques into IDS has become a widely accepted idea.

The dissertation mainly aims at applying several active machine learning strategies to intrusion detection, and systematically studies signal analysis techniques for intrusion detection based on statistical learning theory (SLT), symbolic inductive learning theory, and genetic learning methods. It also compares and evaluates the performance of intrusion detection techniques based on different machine learning strategies, using computational learning theory and statistical hypothesis testing.

From the standpoint of statistical learning theory, intrusion detection is regarded as a pattern recognition problem; i.e., normal and anomalous behavior are distinguished on the basis of observed data such as network flows and host audit records. When the training set is unlabelled and unbalanced, attack detection is treated as outlier detection or sample density estimation, and a hypersphere-based one-class SVM is used to solve it. When the training set is labelled and unbalanced, so that the smaller class suffers a much higher classification error rate, a weighted SVM algorithm, i.e., dual ν-SVM, is introduced for anomaly detection. Furthermore, the dissertation extends the binary SVM algorithm to multiclass SVM and presents the corresponding performance comparison experiments.

Symbolic inductive learning theory is also applied to intrusion detection; its fundamental idea is to treat intrusion detection as a problem of knowledge representation and rule extraction. Rough set theory, founded on indiscernibility relations, is the common theoretical basis for this kind of machine learning. The dissertation explores approaches to modeling the normal behavior of processes based on rough set knowledge representation and rule acquisition. In addition, it proposes a hybrid anomaly detection algorithm that combines rough set reduction with SVM classification. The underlying idea is to reduce the data dimension by attribute reduction and then process the reduced and normalized data with the binary ν-SVM algorithm. The algorithm shortens detection time without losing detection precision, so it is better suited to real-time intrusion detection.

Another view of intrusion detection treats machine learning as a search process; that is, intrusion detection is in essence the search for, or approximation of, intrusion rules according to an established search strategy. After some concerned
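
The abstract does not describe how attribute reducts are computed. As a minimal sketch of the idea of compressing the attribute space through indiscernibility relations, the code below groups records that agree on a set of attributes, measures the positive region (records whose group carries a single decision label), and greedily selects attributes until the positive region of the full attribute set is preserved. The greedy heuristic, the function names, and the toy records (protocol, flag, rate) are assumptions made for illustration; they are not the dissertation's algorithm or data.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices that are indiscernible on the given attribute indices."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region_size(rows, labels, attrs):
    """Count rows whose indiscernibility class has exactly one decision label."""
    return sum(
        len(block)
        for block in partition(rows, attrs)
        if len({labels[i] for i in block}) == 1
    )


def greedy_reduct(rows, labels):
    """Greedy forward selection followed by backward pruning.

    Returns a subset of attribute indices that preserves the positive region
    of the full attribute set. It is a heuristic: the result is irredundant
    but not guaranteed to be a smallest reduct.
    """
    all_attrs = list(range(len(rows[0])))
    target = positive_region_size(rows, labels, all_attrs)
    chosen = []
    while positive_region_size(rows, labels, chosen) < target:
        best = max(
            (a for a in all_attrs if a not in chosen),
            key=lambda a: positive_region_size(rows, labels, chosen + [a]),
        )
        chosen.append(best)
    for a in list(chosen):  # drop attributes that turned out to be redundant
        rest = [c for c in chosen if c != a]
        if positive_region_size(rows, labels, rest) == target:
            chosen = rest
    return sorted(chosen)


# Hypothetical toy records: (protocol, connection flag, rate level).
rows = [
    ("tcp", "SF", "low"),
    ("tcp", "S0", "high"),
    ("udp", "SF", "high"),
    ("udp", "S0", "low"),
    ("tcp", "SF", "low"),
]
labels = ["normal", "attack", "normal", "attack", "normal"]
print(greedy_reduct(rows, labels))  # prints [1]
```

In this toy data the connection flag alone separates the two classes, so the other two attributes can be dropped before a classifier such as an SVM is trained on the reduced records.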

  • 【Classification number】TP181; TP393.08
  • 【Times cited】16
  • 【Downloads】1775