Topology Study of a Low-Latency, Unbuffered On-Chip Network with Separation of Transmission and Control
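【Author】 Liu Hao;
【Advisor】 Zou Xuecheng;
【Author information】 Huazhong University of Science and Technology, Semiconductor Chip System Design and Technology, 2009, Ph.D.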
【Abstract】 Advances in microelectronics and semiconductor process technology have greatly increased transistor switching speed and reduced gate delay, but the signal-transmission and power problems caused by global wire delay keep getting worse. Because current microprocessor architectures require wires to reach every memory and functional unit on the chip, interconnect delay has become the key factor limiting chip performance. At the same time, the number of processor cores and functional modules that can be integrated on a single chip keeps growing, and existing microprocessor architectures can no longer meet the demands of on-chip multiprocessor systems. A Network-on-Chip (NoC) architecture based on pipelined interconnect is a promising alternative: it converts unpredictable global wire delay into controllable event latency. NoC-based systems shift the design method from computation-centered to communication-centered, with the goal of providing a high-performance, low-latency, scalable communication network for multi-core chips.

At present, the most widely used and studied NoC topologies include mesh, tree, and multi-ring structures. As a frontier research field, NoC still has many open topics in topology, application development, system platforms, and development tools, including EDA tools, operating systems, implementation cost, network latency, buffering strategy, network congestion, deadlock, and network hot spots. Moreover, in existing NoC architectures the switch node is the basic building block and carries every network function except those of the physical layer. Its functions are therefore cumbersome, its structure is complex, its implementation cost is high, and the delay of data passing through a node is long. This structure limits NoC performance, and the large built-in buffers substantially increase chip implementation cost.

Based on this analysis, this thesis proposes a new NoC topology named S-mesh, an on-chip network that separates transmission from control. In an S-mesh system, the on-chip communication network uses circuit switching, while devices at the network edge, such as resource nodes, use packet switching. The S-mesh architecture consists of two sub-networks: a 2D-mesh-based data transmission network and a butterfly-based control network. It differs from other topologies in two main ways. First, switch nodes no longer perform transport-layer or network-layer functions; they handle only link-layer and physical-layer functions. Second, network management units are added to handle system resource management, routing decisions, and flow control.

The work of this thesis covers network topology, the unbuffered switch-node structure, network flow control, and the system routing algorithm. First, the architecture separates transmission from control to simplify switch-node functions and build a low-latency on-chip communication network. Second, the unbuffered switch-node structure reduces buffer capacity and chip implementation cost, and cuts the packet latency in each switch node to one clock cycle. Third, the S-mesh routing algorithm, based on shortest-path and destination-driven routing, determines the optimal route for each transmission process; together with the system's three-level flow control strategy, it lays the foundation for resolving network congestion and deadlock. Finally, the BS-mesh architecture adds a bypass network to optimize communication between adjacent nodes.

The results show that the S-mesh network has strong processing capability: the peak processing performance of its control network reaches about 2425 MIPS. Because the architecture separates transmission from control and manages network resources centrally, it avoids network congestion and deadlock. The unbuffered switch-node structure lowers implementation cost and raises bisection bandwidth: a single switch node provides up to about 23.5 GB/s of data bandwidth, and the bisection bandwidth of a 4×4 S-mesh network is up to 37.64 GB/s. The design also has a clear advantage in switch area (0.0186 mm²), which is at a leading level, and a higher operating speed.

The S-mesh on-chip network, with its separation of transmission and control, is characterized by low latency, high performance, and low cost. It suits medium-scale and larger on-chip networks and is especially advantageous for services with long packets, while the BS-mesh architecture suits applications whose data traffic shows pronounced locality.
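The abstract states that routing in the 2D-mesh data network follows shortest paths toward the destination and that each switch node adds one clock cycle of latency. A minimal sketch of what that implies for transit time is shown below. It assumes dimension-order (X then Y) routing as one possible shortest-path scheme; the thesis's actual algorithm is not given in the abstract, and the names `xy_route` and `transit_cycles` are hypothetical. Circuit setup through the control network and wire delay are ignored.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (x, y) coordinate of a switch node in the 2D mesh


def xy_route(src: Node, dst: Node) -> List[Node]:
    """List the switch nodes visited from src to dst.

    Assumption: dimension-order routing (move along X first, then Y),
    which always yields a shortest path in a 2D mesh.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def transit_cycles(path: List[Node], cycles_per_switch: int = 1) -> int:
    """Cycles spent inside switch nodes along the path.

    Uses the one-cycle-per-switch figure from the abstract as the default;
    setup time and link delay are deliberately left out of this sketch.
    """
    return len(path) * cycles_per_switch


if __name__ == "__main__":
    route = xy_route((0, 0), (3, 2))
    print(route)
    print(transit_cycles(route))
```

In a 4×4 mesh, the route from (0, 0) to (3, 2) visits six switch nodes, so under these assumptions the data spends six cycles in switches once the circuit is established.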