600MHz YHFT-DX乘法部件的设计与验证

The Design and Verification of Multiply Unit of 600MHz YHFT-DX

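【Author】 Gu Xuanqiong (辜选琼)

【Advisor】 Li Shaoqing (李少青)

【Author information】 National University of Defense Technology, Software Engineering, 2010, Master's thesis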

【Abstract】 YHFT-DX is a 32-bit, high-performance fixed-point DSP based on a VLIW architecture. Its CPU core contains two independent multiply units that are identical in function and structure, and both are pipelined, which gives YHFT-DX high multiplication performance. However, the multiply unit must support a large number and variety of instructions, which makes its internal structure complex and makes the 600MHz design target a challenge. Following the design requirements of the YHFT-DX processor, and using a mixed full-custom and semi-custom design methodology, this thesis studies the key factors that affect timing and area at the system, module, and circuit levels, and completes a multiply unit design that reaches the 600MHz target. The main contributions are as follows:

1. Building on a detailed analysis of the multiply unit's function and pipeline structure, the three pipelines are adjusted and optimized. Pipelines of the same type are optimized through inter-stage logic merging, unification of handling, and moving logic to earlier stages. Pipelines of different types share inter-stage registers through time-division multiplexing, which saves hardware resources.

2. The critical modules are implemented in full custom and optimized at the structural, circuit, and layout levels. At the structural level, the modules are optimized by partitioning them into levels and pipeline stages, reducing the number of operand bits, and splitting, regrouping, or transforming logic. At the circuit level, in addition to commonly used circuit structures, a register with high drive capability is designed to reduce the number of logic levels. At the layout level, bit-slice design, source/drain sharing, and routing-channel reuse are applied to reduce long interconnects and parasitic parameters. Together, these three levels of optimization ensure that the full-custom modules meet the timing requirements.

3. Logic synthesis and physical design are completed for the unit, including the full-custom modules. The floorplan, the power and ground plan, and the clock design of the multiply unit are derived from the full-chip layout, and "useful skew" is introduced in the clock design to balance the internal paths that violate timing.

The design uses a 130nm CMOS process and occupies an area of 400 × 430 μm². Verification shows that the design is functionally correct and that the delay of the longest path is 1.31ns, a 37.5% timing improvement over an implementation done entirely with the semi-custom method, which meets the design target.
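The timing figures in the abstract can be checked with a little arithmetic, and the idea behind "useful skew" can be illustrated in the same sketch. This is only an illustration under stated assumptions, not the thesis's analysis: the 37.5% figure is read as a reduction relative to the semi-custom delay, setup time and clock-to-Q delay are ignored, and the two stage delays and the 0.2ns skew in the second part are hypothetical numbers chosen for the example.

```python
# Figures taken from the abstract.
TARGET_MHZ = 600.0
CRITICAL_PATH_NS = 1.31
REDUCTION = 0.375  # assumed to be relative to the semi-custom delay


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def setup_slack(period: float, delay: float,
                launch_skew: float = 0.0, capture_skew: float = 0.0) -> float:
    """Simplified setup slack of a register-to-register path.

    Ignores setup time, clock-to-Q delay, and hold checks.
    """
    return period + capture_skew - launch_skew - delay


period = period_ns(TARGET_MHZ)                   # about 1.667 ns
margin = period - CRITICAL_PATH_NS               # left for setup, uncertainty
baseline = CRITICAL_PATH_NS / (1.0 - REDUCTION)  # implied semi-custom delay

# Hypothetical two-stage example: R0 -> R1 takes 1.80 ns, R1 -> R2 takes 1.40 ns.
d1, d2 = 1.80, 1.40
no_skew = (setup_slack(period, d1), setup_slack(period, d2))

# Delaying R1's clock by 0.2 ns lends time from the fast stage to the slow one.
skew_r1 = 0.2
with_skew = (setup_slack(period, d1, capture_skew=skew_r1),
             setup_slack(period, d2, launch_skew=skew_r1))

print(f"period at {TARGET_MHZ:.0f} MHz: {period:.3f} ns, margin: {margin:.3f} ns")
print(f"implied semi-custom delay: {baseline:.3f} ns")
print(f"slack without skew: {no_skew[0]:.3f}, {no_skew[1]:.3f}")
print(f"slack with skew:    {with_skew[0]:.3f}, {with_skew[1]:.3f}")
```

Under these assumptions, the 1.31ns path fits inside the 600MHz period with some margin, while the implied semi-custom delay would not. In the hypothetical example, the first stage violates timing without skew and both stages meet it once the middle register's clock is delayed.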

【Key words】 DSP; VLIW; 600MHz; Multiply Unit; Full-Custom; SIMD