
Research on Several Techniques Accelerated by Programmable Graphics Hardware

Relevant Technology Study on Programmable Graphics Hardware

【Author】 Dong Zhao (董朝)

【Advisors】 Peng Qunsheng (彭群生); Chen Wei (陈为)

【Author information】 Zhejiang University, Computer Application Technology, 2005, Master's degree

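【Abstract, translated from the Chinese】 The computing power of graphics processing units (GPUs) in current graphics hardware is growing faster than that of central processing units (CPUs); mainstream graphics hardware manufacturers claim that GPU performance now doubles roughly every 12 months. A key breakthrough in graphics hardware has been the introduction of programmability, which lets users write custom shader programs to replace certain functional modules of the formerly fixed pipeline, making the GPU functionally more like a general-purpose processor. Although the GPU computes very quickly, algorithms previously implemented on the CPU cannot simply be moved to it unchanged, because the GPU executes instructions differently: its architecture is a highly parallel single instruction, multiple data (SIMD) design. To implement on programmable graphics hardware algorithms that run inefficiently on the CPU, the data structures and steps of each algorithm must therefore be reorganized to exploit the performance advantages of the GPU's parallel architecture. The algorithms in this thesis are all implemented on programmable graphics hardware and reach real-time performance while preserving the quality of the results.

The research in this thesis covers the following topics:

1. Real-time voxelization and its applications. An efficient voxelization method for complex geometric models is proposed. The algorithm first transforms the faces of the geometric model into three discrete volume spaces according to the orientation of each face. The voxels generated in each volume space are then stored as 2D textures in three worksheets, which are finally merged into a single worksheet containing all of the volume data. The algorithm traverses the initial geometric model only once. Because it runs entirely on the GPU, it reaches real-time rates for geometric models with two million faces. The algorithm is simple to implement and extends easily to applications such as volume modeling, transparent rendering, and collision detection.

2. Real-time, high-quality rendering of large point models. An adaptive rendering algorithm for large point models is proposed. In a preprocessing stage, the algorithm splits the point model into many point patches, builds a hierarchy for each patch, and stores it as a linear binary tree. During rendering, the model is processed patch by patch: a fast visibility test culls the invisible patches, and each visible patch is rendered in real time on the GPU with a rendering mode chosen according to its distance from the viewpoint. The algorithm both makes full use of the GPU and balances the load between the GPU and the CPU. To cope with the data volume of large models, a fast compression/decompression technique is also proposed that compresses the rendering data held in video memory by a factor of 8 or more. With these algorithms, point-sampled models on the order of millions of points can be rendered in real time and at high quality on an ordinary PC.

3. Real-time shadow mapping. Shadow mapping is an image-space shadow rendering algorithm. It builds on the texture and depth buffer features provided by graphics hardware and, with GPU acceleration, reaches very high rendering efficiency. Two implementations of real-time shadow mapping are described in detail: ordinary GPU-based shadow mapping and hardware shadow mapping.

At the end of the thesis, the author summarizes his experience with programmable graphics hardware and proposes some directions for future research.

Keywords: GPU; programmable graphics hardware; voxelization; real-time rendering; point rendering; shadows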

【Abstract】 The computational power of the Graphics Processing Unit (GPU) in current commodity graphics hardware is increasing much faster than that of the Central Processing Unit (CPU). Leading graphics card manufacturers quote a doubling time for GPU performance of roughly 12 months. A recent major breakthrough in graphics hardware has been the introduction of programmability, which lets the user replace portions of the fixed graphics pipeline with customized shader programs, allowing the GPU to function more like a general-purpose processor. Despite this rendering power, it is neither possible nor meaningful to run algorithms designed for the CPU directly on graphics hardware. The essential difference is that the GPU provides a highly parallel Single Instruction, Multiple Data (SIMD) architecture. The key to harnessing this resource is to reengineer computationally expensive algorithms so that they take advantage of this architecture and of the rendering optimizations built into the programmable graphics pipeline. This thesis presents several novel graphics approaches that use programmable graphics hardware to obtain both real-time frame rates and high-quality results.

The research in this thesis focuses on the following aspects:

1. Real-time Voxelization for Complex Models. We present an efficient voxelization algorithm for complex polygonal models that exploits the newest programmable graphics hardware. We first convert the model into three discrete voxel spaces according to its surface orientation. The resulting voxels are encoded as 2D textures and stored in three intermediate buffers called directional sheet buffers. These buffers are finally synthesized into one worksheet, which records the volumetric representation of the target. The whole algorithm traverses the geometric model only once and runs entirely on the GPU, achieving real-time frame rates for models with up to 2 million triangles. The algorithm is simple to implement and can be integrated easily into diverse applications such as volume-based modeling, transparent rendering, and collision detection.

2. High Quality Real-time Rendering of Large Scale Point Models. We introduce an adaptive rendering algorithm for large-scale point models. In a preprocessing step, the algorithm subdivides the target model into multiple patches. A hierarchical structure is built for each patch and then converted into a linear binary tree. During rendering, the model is processed patch by patch. A fast visibility test culls invisible patches. Visible patches are displayed on the GPU with a rendering mode chosen by a distance-dependent strategy. Our algorithm takes full advantage of the GPU and effectively balances the workload between the CPU and the GPU. We also propose a fast compression/decompression technique that achieves an 8x compression ratio. The results demonstrate high-performance, high-quality rendering of large-scale point models on a consumer PC.

3. Real-time Shadow Mapping. Shadow mapping is an image-based shadowing technique. It is particularly amenable to hardware implementation because it relies on existing hardware functionality: texturing and depth buffering. We present in detail the implementation of two real-time shadow mapping methods: common GPU-based shadow mapping and hardware shadow mapping.

Finally, I summarize my own research experience with the programmable graphics pipeline and propose some potential topics for future research.
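The first voxelization step, grouping faces into three voxel spaces by surface orientation, can be illustrated with a small CPU-side sketch. It assumes that "orientation" means the axis along which a triangle's normal has its largest absolute component; the abstract does not state the exact criterion, and the thesis performs this work on the GPU. The names `face_normal`, `dominant_axis`, and `split_by_orientation` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def face_normal(tri: Triangle) -> Vec3:
    """Unnormalized normal of a triangle: cross product of two edges."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = tri
    ux, uy, uz = bx - ax, by - ay, bz - az
    vx, vy, vz = cx - ax, cy - ay, cz - az
    return (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)


def dominant_axis(tri: Triangle) -> int:
    """Index (0 = x, 1 = y, 2 = z) of the largest |normal component|.

    Assumption: this is the orientation criterion; the source only says
    faces are assigned "according to surface orientation".
    """
    magnitudes = [abs(c) for c in face_normal(tri)]
    return magnitudes.index(max(magnitudes))


def split_by_orientation(triangles: List[Triangle]) -> List[List[Triangle]]:
    """Visit each triangle once and place it in one of three buckets,
    one per axis-aligned voxel space."""
    buckets: List[List[Triangle]] = [[], [], []]
    for tri in triangles:
        buckets[dominant_axis(tri)].append(tri)
    return buckets


if __name__ == "__main__":
    tris = [
        ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),  # lies in the xy plane
        ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),  # lies in the yz plane
    ]
    x_faces, y_faces, z_faces = split_by_orientation(tris)
    print(len(x_faces), len(y_faces), len(z_faces))
```

Each bucket would then be rasterized along its own axis into a directional sheet buffer. Projecting a face along the axis closest to its normal keeps its projected area large, which is one plausible reason for splitting the model three ways.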

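  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year and issue】 2005, Issue 02
  • 【Classification code】 TP391.41
  • 【Times cited】 41
  • 【Downloads】 718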