
Analysis, Modeling, and Security Research on P2P User and Network Behavior Based on Measurement and Simulation

Research on P2P Network and User Behaviors Modeling and Analysis Based on Measurement and Simulation

【Author】 Jia Jinkang (贾晋康)

【Advisor】 Chen Changjia (陈常嘉)

【Author information】 Beijing Jiaotong University, Communication and Information Systems, 2009, PhD

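【Abstract (translated from the Chinese)】 In recent years, P2P technology, a means of integrating and allocating resources among networked computers, has made waves in commercial operation, academic research, and even social ethics, driven by the explosive growth of access bandwidth and the rapid development of new Internet applications. This new network architecture tries to exploit the idle capacity (spare CPU time, storage space, network bandwidth, and so on) of every computer connected to the network. Through mutual help and cooperation, it breaks through the performance bottlenecks of traditional networks and can also complete, efficiently, reliably, and securely, many of the "impossible tasks" that traditional networks faced. Meanwhile, the users sitting in front of these computers, linked by a wide variety of P2P applications, build platforms for information exchange and resource sharing in the virtual world, giving cold machines a kind of "social character".

Fully grasping the current state of P2P applications, mining the behavior patterns of P2P users, understanding how protocol design affects overlay network characteristics, and uncovering the security problems in today's P2P networks not only provide a basis and direction for improving P2P network design and protocols, but also provide a methodological foundation for P2P traffic assessment, monitoring, management, and maintenance in commercial operation. In this thesis, based on P2P application data actually collected from the Internet, combined with simulation programs that model the mechanisms and characteristics of P2P networks, we use data mining, analysis, and theoretical modeling to study the two most popular types of P2P systems today (file sharing systems and live streaming systems):

· Using user data collected from the central server of a BitTorrent (BT) system, we extract and model user characteristics. Measurements show that the distributions of both publishing and downloading behavior exhibit a "flat head" similar to "fetch-at-most-once". Because the two behaviors have different causes, we propose an "indirect selection" principle to explain this phenomenon in user publishing behavior. We also reveal the relationship between file lifetime and user interest, and the long-term sharing characteristics users exhibit during file sharing.

· Using a crawler we wrote ourselves, we tracked and monitored in real time a large number of users of live streaming systems represented by PPStream (PPS). Compared with other types of P2P applications, the real-time playback requirement of streaming systems leads to distinctive system designs, which in turn make ordinary users behave distinctively. Using real data, we try to reveal the characteristics of PPS users in terms of geographic distribution, connection stability, join and leave patterns, sharing, and topology construction. The results indicate that the system design trades fairness for efficiency; overall sharing efficiency is high and basically meets most users' real-time playback requirements.

· Because users in P2P systems are to some degree autonomous and social, how to design incentive mechanisms that encourage users to participate more in sharing is a question some systems must consider in design and implementation. Different incentive mechanisms directly result in different ways of organizing users, most visibly in the construction of the overlay topology. By simulating the connection management mechanisms of BT and CoolStreaming, we wrote simulation programs that reveal some topological characteristics of these two different types of P2P systems.

· In P2P live streaming systems, how user buffers are established and updated directly affects overall sharing efficiency. Through theoretical derivation, we study how each user's choice of startup point affects the distribution of playback delay across the system. Measurements show that users generally follow a "linear placement" strategy when choosing the playback starting point, and this strategy directly affects the distribution of user playback progress. We also reveal the relationship between user delay and important system design parameters; by improving these parameters, system performance can be raised further.

· Security is also a key consideration in P2P network design. In streaming systems based on chunked data transmission, if no defense is taken, a few cooperating "malicious polluters" can cause playback errors for most users and even paralyze the whole system. We propose a verification mechanism that exploits the interrelation among different data chunks and builds a supervision network among an ordinary user's multiple "neighbors" to identify and cut off pollution sources effectively and promptly. Simulations show that the mechanism effectively prevents the spread of polluted chunks. Its overhead is small and it is very easy to implement, so it can be put to practical use.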

【Abstract】 In recent years, with the explosive growth of access bandwidth and of new Internet applications, Peer-to-Peer (P2P) technology, a tool for integrating and reallocating idle resources, has had a great influence not only on commerce and industry but also on academia and society. This new infrastructure aims to make full use of the spare capacity (idle CPU cycles, disk space, bandwidth, etc.) of every PC that has access to the Internet. Through mutual cooperation and assistance, these PCs can break through the performance bottlenecks of the traditional infrastructure and handle, efficiently, reliably, and safely, the "missions impossible" that the traditional Internet once faced. At the same time, the users sitting in front of these PCs can establish platforms for information exchange and resource sharing in a virtual world connected by all kinds of P2P applications. That is, these cold PCs gradually gain a social character under human control. Understanding the state of the various P2P applications, mining the behavior models of P2P users, understanding the network characteristics that result from protocol implementation, and discovering security problems all contribute to improving the design principles and protocols of P2P systems. They also provide a methodology for P2P traffic monitoring and maintenance. In this thesis, based on P2P datasets captured on the Internet and on simulation programs, we study two of the most popular kinds of P2P applications: file sharing systems and live media streaming systems.

● By collecting user profiles from BitTorrent, we extracted and modeled user behavior in the system. Both publishing and downloading activities exhibit an obvious flat-head phenomenon similar to "fetch-at-most-once" behavior. We propose the "Indirect Selection Principle" to explain this characteristic in users' publishing behavior. We also present the relationship between file life span and users' interests, and the sharing properties of users.
● Using a crawler we developed, we traced and monitored a large number of users of the PPStream system in real time. Because of strict real-time requirements, the distinctive policies implemented in the system lead to distinctive user characteristics. Based on the datasets, we try to reveal user behavior in terms of geographical distribution, connection stability, arrival and departure patterns, sharing, and topology. The results show that the system sacrifices fairness for efficiency.
● Because users are autonomous and social, how to design incentive mechanisms that encourage users to share more is a vital concern. Different mechanisms produce different organization patterns among users, and the topology varies accordingly. By simulating the connection policies of the BitTorrent and CoolStreaming systems respectively, we try to disclose some of their topological characteristics.
● In P2P live streaming systems, the way buffers are established and updated affects the sharing ratio of the whole system. From our measurements, we identify the "linear placement" mechanism adopted in the PPLive system, and we argue that this strategy directly influences the playback process. Rigorous theoretical analysis reveals the relationship between playback latency and several key design parameters. Optimizing these parameters can further improve performance.
● Security is of vital importance in P2P systems. In chunk-based live streaming systems, a few "polluters" can bring down the whole system by injecting fake chunks. By exploiting the relationship among different data chunks, we propose a mesh-based mechanism that protects peers from pollution. The mechanism is lightweight and easy to implement.
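
The flat-head effect attributed above to "fetch-at-most-once" behavior can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the model developed in the thesis: it assumes file popularity follows a Zipf-like law, and every name and parameter (`simulate_requests`, `num_users`, `zipf_exponent`, and so on) is hypothetical. When each user may fetch a given file at most once, no file can collect more requests than there are users, so the head of the rank curve is capped.

```python
import random
from collections import Counter
from itertools import accumulate


def simulate_requests(num_users=200, num_files=500, requests_per_user=50,
                      zipf_exponent=1.0, at_most_once=True, seed=0):
    """Return per-file request counts, sorted from most to least requested.

    Assumption (not from the thesis): intrinsic popularity is Zipf-like,
    i.e. the rank-r file is drawn with weight 1 / r**zipf_exponent.
    requests_per_user must stay below num_files so that a user can always
    find a file that has not been fetched yet.
    """
    rng = random.Random(seed)
    files = list(range(num_files))
    weights = [1.0 / (rank ** zipf_exponent) for rank in range(1, num_files + 1)]
    cum_weights = list(accumulate(weights))
    counts = Counter()
    for _ in range(num_users):
        fetched = set()  # files this user has already downloaded
        for _ in range(requests_per_user):
            choice = rng.choices(files, cum_weights=cum_weights, k=1)[0]
            if at_most_once:
                # Redraw until the user picks a file not fetched before.
                while choice in fetched:
                    choice = rng.choices(files, cum_weights=cum_weights, k=1)[0]
                fetched.add(choice)
            counts[choice] += 1
    return sorted(counts.values(), reverse=True)


if __name__ == "__main__":
    capped = simulate_requests(at_most_once=True)
    free = simulate_requests(at_most_once=False)
    print("top 5 counts, fetch-at-most-once:", capped[:5])
    print("top 5 counts, repeated fetches allowed:", free[:5])
```

With repeated fetches allowed, the most popular files keep accumulating requests in proportion to their weight. With the at-most-once constraint, the top counts cannot exceed `num_users`, which flattens the head of the distribution.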
