多核SoC片上网络关键技术研究

Key Techniques of Network-on-Chip Design for Multi-Core System-on-Chip

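【Author】 Liu Xiangyuan (刘祥远)

【Advisor】 Chen Shuming (陈书明)

【Author information】 National University of Defense Technology, Microelectronics and Solid-State Electronics, 2007, Ph.D.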

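【Abstract (translated from the Chinese)】 Driven by technological progress and growing demand, the integration density and complexity of future SoCs will continue to increase, and a single chip will integrate hundreds of IP cores, including RISC cores, DSP cores, and memory cores. In multi-core SoC designs of this scale, how the IP cores communicate with one another becomes a key problem. Future multi-core SoCs therefore need on-chip communication systems with high performance, low power consumption, and strong scalability, which has become a research focus in recent years. Because traditional on-chip communication structures (such as buses) can no longer meet current design requirements, communication-centric Network-on-Chip (NoC) technology offers a new solution to the multi-core SoC communication problem. This dissertation studies the design and optimization of communication networks for multi-core SoCs in depth. Based on the communication characteristics of the target system, a multi-core DSP, it analyzes the key techniques of NoC design and investigates NoC design methods that optimize power and area overhead while meeting high-performance communication requirements. The main work and contributions are:

1) At the NoC physical layer, for the interconnect implementation problem, a hybrid insertion method based on repeaters and low-swing transmission circuits is proposed: HI (Hybrid Insertion). It overcomes the inability of existing repeater insertion methods and low-swing transmission methods to deliver both performance and low power. The optimal number and positions of the repeaters and low-swing transmission circuits to insert on an interconnect under the HI method are derived and proved. Experimental results show that the HI method effectively reduces the delay, power, and area overhead of global interconnects.

2) Building on the HI method, a low-swing interconnect estimation model based on three-dimensional lookup tables is proposed: LSIEM (Low Swing Interconnect Estimation Models). It addresses the scarcity and lack of generality of high-level estimation models for low-swing interconnects. The algorithmic framework of the LSIEM model is given first; the model can quickly estimate the performance, power, and area overhead of long interconnects early in the design. Using the LSIEM model, the OWS-HI (Optimal Wire Sizing for Hybrid Insertion) method is further proposed, which optimizes wire dimensions while interconnecting with the HI method. Experimental results show that, compared with HSPICE simulation, the delay and power estimates of the LSIEM model both exceed 90% accuracy, and computation speed is improved 95-fold.

3) At the NoC physical layer, for the global synchronization problem, an asynchronous FIFO structure based on weighted Gray code pointers and a real-time state detection mechanism is proposed: WG-FIFO (Weighted-Gray code FIFO). It overcomes the conservative, inefficient, and space-wasting nature of existing asynchronous FIFO designs. The overall structure of the WG-FIFO is given, and the correctness and effectiveness of the pointer encoding and the state detection mechanism are analyzed. Experimental results show that, compared with existing asynchronous FIFOs, the WG-FIFO achieves higher read/write performance and operating efficiency at FIFO depths of 4 to 16, and also reduces area overhead.

4) At the NoC network layer and network adaptation layer, a hierarchical, cluster-based NoC architecture is proposed according to the communication characteristics of the target multi-core DSP: LSGT-NoC (Locally Star Globally Torus Network-on-Chip). It addresses the failure of existing NoC designs to balance performance and power well. A hierarchical LSGT (Locally Star Globally Torus) topology is proposed; a transport protocol supporting bulk transfers and intra-cluster multicast is designed; the basic NoC components, including routing nodes, network interfaces, and global links, are implemented; and the crossbar and global links in particular are optimized for low power. Experimental results show that the LSGT-NoC architecture offers a low hop count, high bandwidth, low power, and strong scalability.

The research results provide a feasible solution to the communication problem of multi-core SoCs, and a theoretical and practical basis for further improving the parallelism and real-world performance of multi-core SoCs.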

【Abstract】 Driven by advances in technology and growing application requirements, the complexity and capacity of SoCs will continue to grow. A single chip in the next decade may contain hundreds of IP cores, including RISC cores, DSP cores, and storage elements. In multi-core SoC designs at this scale, a key issue is how the IP cores communicate with one another. On-chip communication systems with high performance, low power, and good scalability for future multi-core SoCs have therefore become an active research area in recent years. Traditional on-chip communication structures, such as buses, cannot satisfy the requirements of current multi-core SoC designs. Network-on-Chip (NoC), a communication-centric technique, provides a new solution to the communication problem of multi-core SoCs. This dissertation focuses on the design and optimization of NoCs for multi-core SoCs. Based on the communication features of the target system, a multi-core DSP, the dissertation analyzes the key techniques of NoC design and studies design methods that reduce power dissipation and area overhead while still satisfying the needs of high-performance communication. The main contributions of the dissertation are as follows:

1) In the physical layer of the NoC communication stack, a hybrid insertion strategy based on full-swing repeaters and low-swing transceivers, the HI (Hybrid Insertion) strategy, is proposed for global interconnect implementation. The HI strategy balances interconnect delay against power dissipation, which existing repeater insertion schemes and low-swing transmission schemes fail to do. The optimal parameters required by the HI strategy, namely the number and the insertion interval of the repeaters and low-swing transceivers, are derived and proved in the dissertation. It is shown that the HI strategy effectively reduces interconnect delay, power dissipation, and area cost.

2) To address the lack of general high-level estimation models for low-swing interconnects, an estimation model based on three-dimensional lookup tables, LSIEM (Low Swing Interconnect Estimation Models), is presented on the basis of the HI strategy. The overall framework of the LSIEM model is given first; the model can rapidly estimate the delay, power, and area cost of long wires during early design stages. Using the LSIEM model, an OWS-HI (Optimal Wire Sizing for Hybrid Insertion) scheme is then proposed to optimize the wire size of HI interconnects. Experimental results show that, compared with HSPICE simulation, the LSIEM delay and power estimation models both achieve an accuracy of more than 90% while increasing computation speed 95-fold.

3) Also in the physical layer of the NoC communication stack, a new asynchronous FIFO structure, WG-FIFO (Weighted-Gray code FIFO), is proposed for global synchronization. The WG-FIFO encodes its write and read pointers with a new Weighted-Gray code and controls write and read operations with real-time global state detectors. It thereby overcomes shortcomings of existing asynchronous FIFO designs, such as conservativeness, inefficiency, and wasted space. The dissertation describes the overall structure of the WG-FIFO in detail and analyzes the validity of the pointer coding scheme and the state detection mechanism in depth. Simulation results show that, compared with other available FIFO designs, the proposed FIFO improves throughput, area cost, and write/read operation efficiency for depths from 4 to 16.

4) In the network layer and network adaptor layer of the NoC communication stack, a clustered hierarchical NoC design, LSGT-NoC (Locally Star Globally Torus Network-on-Chip), is proposed according to the communication features of the target multi-core DSP. The LSGT-NoC design strikes a good balance between performance and power dissipation. The dissertation introduces the hierarchical LSGT (Locally Star Globally Torus) topology and a transport protocol that supports multicast transmission, and implements the essential NoC components, such as the switch, the network interface, and the global link. Low-power optimization methods for the crossbar and the global link are explored in particular. Simulation results show that LSGT-NoC offers a low hop count, high bandwidth, low power, and good scalability.

The research in the dissertation provides a feasible solution to the communication problem in multi-core SoCs, and the results can be used to further improve the practical parallelism and performance of multi-core SoCs.
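
As background for contribution 3, the sketch below illustrates the conventional Gray-coded pointer scheme that asynchronous FIFOs commonly build on. It is not the dissertation's design: the Weighted-Gray code and the real-time state detectors are not specified in this abstract, so the standard reflected binary Gray code is used instead. The model is purely behavioural and single-threaded (no clock domains or synchronizer latency), and every name in it is hypothetical.

```python
def bin_to_gray(n: int) -> int:
    """Reflected binary Gray code: successive values differ in exactly one bit."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Inverse of bin_to_gray (XOR of all right shifts of g)."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


class GrayPointerFifoModel:
    """Behavioural model of a FIFO of depth 2**addr_bits.

    Pointers are (addr_bits + 1) bits wide; the extra bit distinguishes
    "full" from "empty" when the address bits coincide. This is the
    textbook scheme, used here as an assumption, not the WG-FIFO.
    """

    def __init__(self, addr_bits: int):
        if addr_bits < 1:
            raise ValueError("addr_bits must be at least 1")
        self.addr_bits = addr_bits
        self.depth = 1 << addr_bits
        self.ptr_mask = (1 << (addr_bits + 1)) - 1
        self.wr = 0  # binary write pointer
        self.rd = 0  # binary read pointer
        self.mem = [None] * self.depth

    def empty(self) -> bool:
        # Empty: the Gray-coded pointers are identical.
        return bin_to_gray(self.wr) == bin_to_gray(self.rd)

    def full(self) -> bool:
        # Full: the Gray-coded pointers differ in exactly the top two bits.
        top_two = 0b11 << (self.addr_bits - 1)
        return bin_to_gray(self.wr) == bin_to_gray(self.rd) ^ top_two

    def push(self, item) -> bool:
        """Write one item; returns False if the FIFO is full."""
        if self.full():
            return False
        self.mem[self.wr & (self.depth - 1)] = item
        self.wr = (self.wr + 1) & self.ptr_mask
        return True

    def pop(self):
        """Read one item; raises IndexError if the FIFO is empty."""
        if self.empty():
            raise IndexError("pop from empty FIFO")
        item = self.mem[self.rd & (self.depth - 1)]
        self.rd = (self.rd + 1) & self.ptr_mask
        return item
```

The single-bit change between successive Gray values is what makes it safe to sample a pointer from another clock domain: a sample taken during a transition yields either the old or the new value, never an unrelated one. In a real design the comparison would use a synchronized, and therefore possibly stale, copy of the other domain's pointer, which is one source of the conservativeness that the abstract attributes to existing asynchronous FIFOs.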
