
基于云计算的病毒恶意软件分析研究

Research on Virus and Malwares Analysis Based on Cloud Computing

【Author】 Meng Chao (孟超)

【Advisor】 Sun Zhixin (孙知信)

【Author information】 Nanjing University of Aeronautics and Astronautics, Computer Application Technology, 2013, PhD

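【Abstract (translated from the Chinese 摘要)】 Antivirus software remains the most widely used tool for detecting viruses and malware, yet the effectiveness of traditional detection methods has long been questioned. Traditional methods cannot effectively detect and remove new viruses and malware, and their growing complexity leaves the antivirus software itself open to attack by malware. The emergence of cloud computing changes this situation.

Cloud computing is the product of the convergence of distributed computing, grid computing, utility computing, virtualization, and related computing and networking technologies. It pools large amounts of computing resources, delivers IT services to ordinary users over the Internet, and charges by usage. Cloud computing can also provide security services to end users. In a cloud security service, a large number of clients monitor software behavior on the network for anomalies, collect up-to-date information on Trojans, worms, and other malware, and send it to cloud servers for automatic analysis and processing. The resulting remedies are then distributed to every client.

This dissertation combines cloud-based virus and malware detection with algorithm analysis theory from machine learning and uses a new distributed Central Force Optimization (CFO) algorithm. Similar to particle swarm optimization, CFO is a recent multidimensional search heuristic inspired by celestial mechanics. It is deterministic: a set of probes moves through the decision space under a law of gravitation to search for the optimum. The dissertation proves the convergence and correctness of the algorithm, which gives its application a sound theoretical basis, and then improves it into a distributed CFO algorithm. Because it is deterministic, the algorithm is particularly well suited to training neural networks for classification. A neural network ensemble is trained in the distributed cloud environment as a static behavior pattern classifier that separates suspicious files from benign executables.

In addition, a maximal independent set optimization algorithm selects virtual machine nodes in the cloud, on which commercial antivirus software is installed so that suspicious files are analyzed comprehensively in a parallel, distributed manner. The isolated environment of the virtual machine nodes is also used to monitor the dynamic behavior of malware: system calls are observed inside the virtual machine to decide whether a file is malicious. The distributed wave algorithm PIF (Propagation of Information with Feedback) formally describes the process of dynamic analysis and report return. The wave algorithm is adapted to the analysis environment, which further improves the efficiency of analysis and detection.

Building on the characteristics of cloud platforms, the dissertation proposes an implementation scheme for an in-cloud malware detection model. Unlike traditional virus detection, the model runs a lightweight host agent on every client host. The agent captures suspicious files entering the system, sends them to the cloud for analysis, and then decides whether to run or quarantine them according to the returned report. In the cloud, the model exploits distributed parallel computing: a maximal independent set algorithm optimizes the cloud network structure and selects distributed virtual machine nodes, a different commercial detection engine is installed on each node, and the engines analyze the file in parallel. The final report combines the results of all engines and is sent to the host agent. For new suspicious files, the service also provides two behavior analysis engines: a dynamic one and a static one.

Existing dynamic analysis on a single machine can follow only one execution path of a malware sample, so its false alarm rate is high. The dissertation therefore proposes a cloud-based dynamic behavior analysis scheme that analyzes multiple execution paths in parallel on multiple virtual machine nodes and monitors system calls in the virtual machines to uncover malicious behavior that is triggered only under specific conditions. The PIF algorithm formally describes the analysis of suspicious files and the return of reports. The improvements made to it also raise analysis efficiency, and as a distributed algorithm it is well suited to the cloud. Experiments show that the model detects condition-triggered behavior and identifies both the triggering conditions and input data that satisfy them, and that cloud-based dynamic monitoring performs considerably better than an ordinary single-machine system.

Almost all current static detection of malicious code relies on signature databases, which malware can evade with fairly simple techniques such as code obfuscation. In response, the dissertation studies neural network ensembles as pattern recognizers for static malware detection and proposes a cloud-based static behavior detection method. Unlike traditional approaches that obtain system call sequences through dynamic execution, the method derives system call sequences with n-gram-based feature extraction and applies feature extraction and selection algorithms to produce feature vectors for suspicious files, which serve as training and test input. Following a detailed theoretical analysis of CFO, a new distributed CFO algorithm is proposed to train the neural network ensemble in the distributed cloud environment. Finally, classification of malware by the neural network ensemble is implemented on a cloud platform. Experiments show that the scheme achieves higher accuracy and a lower error rate than traditional static detection.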
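The abstract describes CFO only in words: probes move deterministically through the decision space, attracted by probes with better fitness. The sketch below illustrates that idea with a simplified single-machine version of the commonly cited CFO update rule. It is not the dissertation's distributed variant. The parameter values (`G`, `alpha`, `beta`, `f_rep`), the initial probe layout, and the function name `cfo_maximize` are all assumptions made for illustration.

```python
import math

def cfo_maximize(fitness, bounds, probes_per_axis=4, steps=50,
                 G=2.0, alpha=2.0, beta=2.0, f_rep=0.5):
    """Simplified Central Force Optimization sketch (maximization).

    All parameter defaults are illustrative assumptions, not values
    taken from the dissertation.
    """
    dims = len(bounds)
    center = [(lo + hi) / 2 for lo, hi in bounds]

    # Deterministic start: probes evenly spaced along each axis through the center.
    probes = []
    for d, (lo, hi) in enumerate(bounds):
        for i in range(probes_per_axis):
            p = list(center)
            p[d] = lo + (i + 1) * (hi - lo) / (probes_per_axis + 1)
            probes.append(p)

    masses = [fitness(p) for p in probes]  # fitness plays the role of "mass"
    best_i = max(range(len(probes)), key=masses.__getitem__)
    best_pos, best_fit = list(probes[best_i]), masses[best_i]

    for _ in range(steps):
        new_probes = []
        for p, (rp, mp) in enumerate(zip(probes, masses)):
            accel = [0.0] * dims
            for k, (rk, mk) in enumerate(zip(probes, masses)):
                diff = mk - mp
                if k == p or diff <= 0:
                    continue  # unit step: only probes with better fitness attract
                dist = math.dist(rk, rp)
                if dist < 1e-12:
                    continue  # coincident probes exert no force
                scale = G * diff ** alpha / dist ** beta
                for d in range(dims):
                    accel[d] += scale * (rk[d] - rp[d])

            moved = []
            for d, (lo, hi) in enumerate(bounds):
                x = rp[d] + 0.5 * accel[d]  # constant-acceleration step with time step 1
                # Probes that leave the decision space are pulled back inside.
                if x < lo:
                    x = lo + f_rep * (rp[d] - lo)
                elif x > hi:
                    x = hi - f_rep * (hi - rp[d])
                moved.append(x)
            new_probes.append(moved)

        probes = new_probes
        masses = [fitness(p) for p in probes]
        for pos, fit in zip(probes, masses):
            if fit > best_fit:
                best_pos, best_fit = list(pos), fit

    return best_pos, best_fit

if __name__ == "__main__":
    # Hypothetical test function with its maximum at (1, -2).
    objective = lambda p: -((p[0] - 1) ** 2 + (p[1] + 2) ** 2)
    print(cfo_maximize(objective, [(-5.0, 5.0), (-5.0, 5.0)]))
```

Because the sketch uses no random numbers, two runs with the same inputs return the same result. This is the deterministic property that the abstract cites as the reason CFO suits neural network training.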

【Abstract】 Antivirus software is currently one of the most widely used tools for detecting and stopping malicious and unwanted files. However, the long-term effectiveness of traditional host-based antivirus is questionable. Antivirus software fails to detect many modern threats, and its increasing complexity has resulted in vulnerabilities that are being exploited by malware. The emergence of cloud computing changes this situation.

Cloud computing is the product of the development of distributed computing, parallel computing, and utility computing. It pools large amounts of computing resources and provides on-demand IT services to remote Internet users. The cloud can also provide security services: a large number of clients monitor software behavior, gather information about malware and malicious code, and send it to the cloud for automatic analysis and processing. The resulting solution is then distributed to all clients.

This dissertation combines cloud-based virus and malware detection with algorithm analysis theory from machine learning and uses a new Central Force Optimization (CFO) algorithm. CFO is a new deterministic multidimensional search metaheuristic based on the metaphor of gravitational kinematics. It explores a decision space by "flying" a group of "probes" whose trajectories are governed by Newton's laws. The dissertation proves the correctness and convergence of CFO, which gives its applications a reliable theoretical basis. The algorithm is then improved further, and a distributed CFO is proposed. Because it is deterministic, the algorithm is well suited to training neural networks for classification problems. The dissertation trains a neural network ensemble as the pattern classifier for static behavioral analysis and uses it to classify suspicious files.

The dissertation uses a maximal independent set algorithm to select virtual machine nodes and installs antivirus software on those nodes to implement parallel distributed analysis. The enclosed environment of the virtual machine nodes is also used to monitor dynamic behavior in order to identify viruses and malware. The distributed Propagation of Information with Feedback (PIF) protocol is used to formally describe the procedure of dynamic analysis and the return of analysis reports. PIF is adapted to the analysis environment, which improves analysis efficiency.

Based on the characteristics of cloud computing, the dissertation advocates a model for malware detection on end hosts that provides antivirus as an in-cloud network service. Each end host runs a lightweight process that acquires executables entering the system, sends them into the network for analysis, and then runs or quarantines them based on a threat report returned by the network service. In the cloud network, the model uses a maximal independent dominating set algorithm to optimize the network structure and to select the distributed virtual machine nodes, on which multiple commercial analysis engines are installed. This enables identification of malicious and unwanted software by multiple heterogeneous detection engines in parallel. The cloud also hosts two behavioral analysis engines: a dynamic analysis engine and a static analysis engine.

Virus and malware analysis is the process of determining the purpose and functionality of a given sample. A current problem of dynamic analysis tools is that only a single program execution is observed, so the error rate is high. We propose a system that uses the abundant resources of the cloud to explore multiple execution paths and identify malicious actions that are executed only when certain conditions are met. The PIF protocol is used to describe the analysis process, and the improvements to PIF raise analysis efficiency. As a distributed algorithm, PIF fits the cloud environment. Our experimental results show that in many cases the system can detect the existence of trigger-based behavior, find the conditions that trigger such hidden behavior, and find inputs that satisfy those conditions, while also improving analysis performance.

Almost all current static methods for detecting malicious code are signature-based, so viruses can easily escape detection through simple mechanisms such as code obfuscation. This dissertation studies neural network ensembles and their application to static detection and, to address this problem, proposes a cloud-based behavioral detection approach. Unlike the traditional approach, it statically analyzes binary code to derive system call sequences based on n-grams. The author analyzes CFO convergence through a mathematical analysis of celestial mechanics and, on that basis, proposes a distributed Central Force Optimization algorithm to train the neural network ensemble. Finally, the dissertation implements the classification of executables. The experimental results show that the proposed approach has higher accuracy and a lower false positive rate than the other detection approaches.
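The static engine described above turns system call sequences into n-gram features for the neural network ensemble. The abstract does not say how the sequences are recovered from the binary or which feature selection algorithm is used. The sketch below therefore assumes a call sequence is already available and uses document frequency as a stand-in selection criterion. The function names and the example system call names are hypothetical.

```python
from collections import Counter

def syscall_ngrams(calls, n=3):
    """Count every run of n consecutive system calls in the sequence."""
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))

def select_vocabulary(samples, n=3, top_k=4):
    """Keep the top_k n-grams that occur in the most samples.

    Document frequency is an assumed selection criterion; the abstract
    does not name the one actually used.
    """
    doc_freq = Counter()
    for calls in samples:
        doc_freq.update(set(syscall_ngrams(calls, n)))
    # Sort by descending frequency, then by n-gram, so ties break deterministically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]

def feature_vector(calls, vocabulary, n=3):
    """Relative frequency of each vocabulary n-gram in one sample."""
    counts = syscall_ngrams(calls, n)
    total = sum(counts.values()) or 1  # avoid dividing by zero for short sequences
    return [counts[gram] / total for gram in vocabulary]

if __name__ == "__main__":
    # Hypothetical call sequences for two samples.
    samples = [
        ["open", "read", "write", "close"],
        ["open", "read", "connect", "send", "close"],
    ]
    vocab = select_vocabulary(samples, n=2, top_k=3)
    for calls in samples:
        print(feature_vector(calls, vocab, n=2))
```

Each resulting vector has one entry per selected n-gram, so every sample maps to a fixed-length input regardless of how long its call sequence is.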
