Research on Load Management Technology in Distributed Data Stream Processing (分布式数据流负载管理技术研究)

【Author】 Wu Zaida (吴再达)

【Advisor】 Mao Yuguang (毛宇光)

【Author information】 Nanjing University of Aeronautics and Astronautics, Computer Software and Theory, 2007, Master's thesis

【Abstract】 Data stream processing has emerged in recent years as a new research direction in the database field, and its broad application prospects have drawn wide attention from researchers. Distributed systems, with their low cost and strong processing capability, are inherently well suited to data streams that arrive at high speed and in large volume, so distributed data stream processing has long been an important part of data stream research. This thesis studies load management in distributed data stream management systems (DSMS). The main contributions are as follows. First, it analyzes the load management techniques of existing DSMS prototype systems, identifies the characteristics and design principles of load management in data stream management systems, improves the load management module of Borealis, and builds an experimental platform. Second, it proposes a load balancing algorithm based on operator correlation analysis and network traffic analysis, which effectively reduces the negative impact of operator migration. Third, it investigates load shedding in distributed query networks, analyzes the load dependencies between nodes, and improves a distributed load shedding strategy. Fourth, it reviews existing load shedding techniques for sliding window join queries and, combining the strengths of earlier work, proposes a load shedding algorithm for sliding window join queries based on the output frequency of basic windows. The work is grounded in practical application environments and avoids unnecessary assumptions as far as possible, providing theoretical support and an experimental reference for further research on load management in data stream applications.
