Design Optimization of a 64-bit High-Performance Floating-Point Multiplier

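【Author】 Li Xiaojing (李晓静)

【Advisor】 Li Shaoqing (李少青)

【Author information】 National University of Defense Technology, Software Engineering, 2010, Master's thesis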

【Abstract】 Floating-point multipliers are structurally complex and have long logic delays, which makes them one of the bottlenecks in high-performance microprocessor design. Implementing floating-point multiplication faster is therefore important for improving processor performance. A semi-custom implementation can no longer meet ever-higher clock frequency requirements. To reach the design target while balancing performance against design effort, this thesis implements the core modules (partial product compression and partial product accumulation) in full-custom design and the multiplier as a whole in semi-custom design. This combination effectively raises the speed of the floating-point multiplier without adding much overhead. The main results are: 1. An improved 4-2 compressor structure is proposed and used in the compression structure; its delay is about 27.5% lower than that of the previous structure. 2. The 4-2 compressor is designed in full-custom with a delay of 0.11 ns, which is 39% lower than the 0.18 ns delay of the semi-custom 4-2 compressor. 3. Based on an analysis of how the bit width of the group adders in a parallel adder relates to the delay of carry-tree generation, a 136-bit adder that is fully parallel in both sum and carry generation is implemented in full-custom; its delay is 0.30 ns, which reduces the total delay of the partial product accumulation module by 21.3%. In the typical (tt) corner of a 65 nm CMOS process, the synthesized frequency of the optimized floating-point multiplier rises from 1.4 GHz to 1.8 GHz, an improvement of about 30%. Back-end physical design of the multiplier was also carried out; after place and route the frequency is 1.36 GHz.
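For readers unfamiliar with the 4-2 compressor named in the abstract, the following is a minimal behavioural sketch in Python. It models the conventional textbook structure built from two cascaded full adders, not the improved structure proposed in the thesis, whose circuit is not described in this record. The function names `full_adder`, `compressor_4_2`, and `check_all_inputs` are illustrative assumptions.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) for three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def compressor_4_2(x1, x2, x3, x4, cin):
    """Conventional 4-2 compressor modelled as two cascaded full adders.

    This is the textbook structure, assumed here for illustration only.
    Returns (s, carry, cout) such that
        x1 + x2 + x3 + x4 + cin == s + 2 * (carry + cout).
    """
    # First stage: cout depends only on x1..x3, never on cin, so a row of
    # compressors has no carry ripple along the row.
    s1, cout = full_adder(x1, x2, x3)
    # Second stage: fold in the fourth input and the neighbour's cout.
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout


def check_all_inputs():
    """Exhaustively verify the arithmetic identity over all 32 input patterns."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(*bits)
        if sum(bits) != s + 2 * (carry + cout):
            return False
    return True


if __name__ == "__main__":
    print(check_all_inputs())
```

In a partial product tree, one such cell per bit column reduces four rows of partial products to two, which is why the compressor delay quoted in the abstract feeds directly into the multiplier's critical path.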

【Key words】 floating-point multiplier; speed optimization; semi-custom; full-custom; LEF; LIB