
Research on Data Stream Processing Methods Based on GPU (基于GPU的数据流处理方法研究)

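【Author】 Lu Xiaowei (卢晓伟)

【Advisor】 Zhou Yong (周勇)

【Author information】 Dalian University of Technology, Computer Application Technology, 2010, Master's thesis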

【Abstract】 As a new kind of stream processor, the GPU fits the stream processing model, is inexpensive and widely available, and offers strong parallel computing power and high memory bandwidth. This high-performance computing capability has drawn growing attention from researchers in many fields. Data streams, as a new form of data, arrive rapidly and continuously and are potentially very large in volume. Raising the throughput of data stream processing systems and improving the real-time performance of stream processing and mining algorithms has therefore become an important problem in data stream research.

This thesis focuses on applying general-purpose computing on graphics processing units (GPGPU) to data stream mining. A distinguishing feature is its high-performance processing of high-dimensional data streams, which are a kind of irregular stream. At the theoretical level, it proposes a general framework model for parallel data stream computing on the GPU. Taking regular stream data and high-dimensional data streams in turn, it analyzes the time-consuming parts of stream processing algorithms and studies how to port the serial algorithms to the GPU to improve performance.

For regular stream data, the thesis studies 3D image reconstruction for electron microscopy (electron tomography, ET), drawing on the mathematical model of 3D image reconstruction and on applied matrix theory, and proposes a GPU-based parallel algorithm. Simulation experiments on regular projection stream data, run on the GPU's CUDA platform, show that with constrained computing resources the algorithm raises processing speed by roughly a factor of 50 while preserving image quality.

For high-dimensional data streams, the thesis proposes a GPU-based processing model and a concrete, feasible architecture for high-dimensional data streams within irregular streams. Within this framework, and based on the Compute Unified Device Architecture (CUDA), it uses a data cube model and dimensionality reduction to analyze in parallel the canonical correlations among multiple high-dimensional data streams. Theoretical analysis and experiments show that the parallel method accurately identifies, online, the correlations between high-dimensional data streams in synchronous sliding-window mode. Compared with a CPU-only method it is significantly faster, meets the real-time requirements of high-dimensional data streams well, and can serve as a general analysis method in high-dimensional data stream mining.
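To make the idea of correlation over a synchronous sliding window concrete, the following is a minimal CPU-only sketch in plain Python. It is not the thesis's CUDA implementation and uses no data cube or dimensionality reduction. It assumes two scalar streams that deliver one sample each per time step; in that one-dimensional case the canonical correlation reduces to the absolute value of the Pearson correlation, which the sketch maintains incrementally. The class name `SlidingWindowCorrelation` and its `push` method are hypothetical names chosen for illustration.

```python
from collections import deque
from math import sqrt


class SlidingWindowCorrelation:
    """Pearson correlation of two synchronized scalar streams over the
    most recent `window` sample pairs (illustrative, CPU-only)."""

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must hold at least 2 samples")
        self.window = window
        self.pairs = deque()
        # Running sums over the current window, updated in O(1) per arrival.
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def push(self, x, y):
        """Add one synchronized sample pair.

        Returns the correlation over the full window, or None while the
        window is still filling or when one stream is constant in it.
        """
        self.pairs.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y

        # Evict the oldest pair once the window is over capacity.
        if len(self.pairs) > self.window:
            ox, oy = self.pairs.popleft()
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy

        n = len(self.pairs)
        if n < self.window:
            return None

        cov = self.sxy - self.sx * self.sy / n
        var_x = self.sxx - self.sx * self.sx / n
        var_y = self.syy - self.sy * self.sy / n
        if var_x <= 0.0 or var_y <= 0.0:
            return None  # correlation is undefined for a constant stream

        r = cov / sqrt(var_x * var_y)
        # Clamp to guard against floating-point drift outside [-1, 1].
        return max(-1.0, min(1.0, r))
```

Each arrival costs constant time because only the running sums change. A GPU version along the lines the abstract describes would presumably evaluate many stream pairs or dimensions in parallel per window, but that mapping is an assumption here and not something this sketch shows.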
