Research on NoC Modeling and Performance Optimization (片上网络的建模仿真与性能优化研究)

【Author】 Cheng Ailian (程爱莲)

【Advisors】 Yan Xiaolang (严晓浪); Pan Yun (潘赟)

【Author information】 Zhejiang University, Circuits and Systems, 2012, Ph.D.

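【Summary】 Thanks to rapid advances in semiconductor processes and VLSI design techniques, embedded systems can handle increasingly complex and performance-demanding workloads. However, the continued shrinking of feature sizes and the growing bandwidth demands of embedded applications pose serious challenges to traditional bus-based on-chip communication architectures. Networks-on-chip (NoCs), with their scalability, flexibility, and transmission parallelism, have attracted wide attention as an emerging communication architecture to replace traditional buses.

NoCs have diverse topologies, complex communication protocols, and numerous configuration parameters, which together form a huge design space. Choosing the most suitable architecture from that space is one of the key problems to solve early in the design. This thesis proposes an analytical model suited to large-scale design space exploration and parameter optimization. Based on the M/M/1/K queueing system, it applies to arbitrary topologies. Besides basic performance figures such as average latency and throughput, it can analyze the spatial distribution of waiting delay across routers and so quickly locate congested regions. The model's novelty is a performance evaluation method based on path decomposition: it uses routing information to split and classify shared links, and it analyzes and quantifies the effect of link dependency on transmission latency under wormhole switching, which improves modeling accuracy.

Analytical models describe network behavior with statistical methods and mathematical formulas. They work at a high level of abstraction, run fast, and handle static, quantifiable system parameters well. Dynamically configurable architectures, complex flow-control policies, and hard-to-quantify factors, however, are difficult for analytical models and call for more accurate simulation tools. This thesis designs a cycle-accurate NoC simulator. It models the router pipeline and control logic at fine granularity, supports communication techniques such as the management and allocation of multiple virtual channels, wormhole switching, and credit-based flow control, and is flexibly configurable in network size, traffic pattern, and buffer size. Simulating NoCs with different parameters shows how topology, the number of virtual channels, and buffer depth affect communication performance, which helps designers optimize the interconnect architecture and improve communication quality.

Building on this modeling and simulation work, the thesis proposes a throughput optimization method based on heterogeneous networks with non-uniform bandwidth. By analyzing the key factors that affect network throughput, it proposes two capacity planning strategies, one based on traffic volume and one based on channel utilization, that optimize channel bandwidth under limited wiring resources to improve communication performance. The traffic-volume-based allocation considers only how the traffic load is distributed proportionally across links. The channel-utilization-based scheme uses the NoC analytical model described above and accounts for traffic volume, link dependency, and resource utilization together, so it predicts congestion locations better. To match and exchange data between links of different widths, the thesis also designs a multi-port router that uses a non-fully-connected crossbar to reduce hardware cost. Finally, the simulation model verifies the throughput improvement of the non-uniform-bandwidth heterogeneous network.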
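The analytical model is stated to build on the M/M/1/K queueing system. As a minimal sketch of that building block only, the following computes the textbook steady-state metrics of a single M/M/1/K queue. It does not reproduce the thesis's path decomposition method or its multi-router model; the function name `mm1k_metrics`, its parameters, and the mapping of a router buffer to one such queue are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MM1KMetrics:
    blocking_probability: float  # probability an arrival finds the queue full
    throughput: float            # accepted arrivals per unit time
    mean_in_system: float        # mean number of customers (waiting + in service)
    mean_sojourn_time: float     # mean time in system per accepted customer
    mean_waiting_time: float     # mean time spent waiting before service


def mm1k_metrics(arrival_rate: float, service_rate: float, capacity: int) -> MM1KMetrics:
    """Steady-state metrics of one M/M/1/K queue (hypothetical helper).

    capacity is K, the maximum number of customers in the system,
    including the one in service.
    """
    if arrival_rate <= 0 or service_rate <= 0 or capacity < 1:
        raise ValueError("rates must be positive and capacity at least 1")

    rho = arrival_rate / service_rate
    # Unnormalized state probabilities rho**n for n = 0..K; normalizing the
    # sum directly also covers the rho == 1 case without a special branch.
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]

    blocking = probs[capacity]
    throughput = arrival_rate * (1.0 - blocking)
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    # Little's law applied to accepted customers only.
    mean_sojourn = mean_in_system / throughput
    mean_waiting = mean_sojourn - 1.0 / service_rate
    return MM1KMetrics(blocking, throughput, mean_in_system, mean_sojourn, mean_waiting)
```

Calling `mm1k_metrics(0.5, 1.0, 2)` would describe a queue with two slots at 50% offered load; sweeping `capacity` shows how extra buffering trades blocking probability against waiting time.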

【Abstract】 Embedded systems benefit from the rapid development of semiconductor technology and VLSI design methodologies, and their performance has improved steadily to meet the needs of complex, computationally intensive applications. However, continuously shrinking device feature sizes and increasing bandwidth requirements challenge traditional bus-based communication architectures. Networks on Chip (NoCs) have emerged as a promising alternative because of their excellent scalability, flexibility, and transaction-level parallelism.

NoCs have diverse topologies, complicated communication protocols, and many configuration parameters. Choosing an optimal architecture from this enormous design space is a common and important problem in the early design stages. We present a general analytical model for large-scale design space exploration and design optimization. It is based on the M/M/1/K queuing system and is not restricted to particular topologies. It provides useful performance information, including average latency, throughput, and the distribution of waiting time across routers. In addition, we propose a routing path decomposition approach to analyze and quantify the influence of link dependencies on latency. It resolves the inherent dependency between successive links occupied by one packet in wormhole routing and improves evaluation accuracy.

Analytical tools model network behavior with statistical methods and mathematical theory. They offer high-level modeling and fast evaluation, and they are helpful in analyzing static, quantitative parameters of well-defined systems. Accurate simulation tools, on the other hand, are better at measuring dynamic and qualitative factors of complex systems. We present a cycle-accurate NoC simulator that emulates the pipeline stages and control logic of routers at the flit level. It supports various communication techniques, such as virtual channel flow control, wormhole routing, and credit-based flow control. The simulator is flexible in configuration and can evaluate assorted topologies, traffic patterns, and router architectures. Simulating NoCs with different configurations reveals the effects of these parameters on communication performance. As such, the simulator is a useful tool for designers in optimizing interconnect architectures and improving communication quality.

Equipped with the analytical model and the accurate simulator, we design a throughput optimization approach based on non-uniform link capacity allocation. We first analyze the key factors that influence NoC throughput and then propose two methods of channel bandwidth planning: one based on traffic volume, and the other based on channel utilization derived from the analytical model. Allocation based on traffic volume considers only the workload distribution, while the solution based on channel utilization takes resource utilization and link dependency into account in addition to the workload. The latter therefore performs better in locating congestion. We present a multi-port router to connect links of different bandwidths. It employs a partially connected crossbar to reduce hardware cost. Simulation results validate the throughput improvement of the proposed heterogeneous NoC.
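The traffic-volume-based planning method is described only as considering the workload distribution across links. A minimal sketch of one way to do that is shown below: a fixed wire budget is split across links in proportion to their traffic load. The function `allocate_link_widths`, the per-link minimum width, and the largest-remainder rounding are assumptions for illustration and are not taken from the thesis.

```python
from typing import Dict


def allocate_link_widths(link_loads: Dict[str, float],
                         total_width: int,
                         min_width: int = 1) -> Dict[str, int]:
    """Split total_width wires across links in proportion to traffic load.

    Hypothetical helper: every link first receives min_width wires, and the
    remaining budget is shared proportionally to the given loads.
    """
    if not link_loads:
        raise ValueError("at least one link is required")
    if any(load < 0 for load in link_loads.values()):
        raise ValueError("loads must be non-negative")
    if total_width < min_width * len(link_loads):
        raise ValueError("budget too small for the minimum width per link")

    spare = total_width - min_width * len(link_loads)
    total_load = sum(link_loads.values())
    if total_load == 0:
        # No traffic information: share the spare wires equally.
        shares = {link: spare / len(link_loads) for link in link_loads}
    else:
        shares = {link: spare * load / total_load
                  for link, load in link_loads.items()}

    # Give each link the integer part of its share.
    widths = {link: min_width + int(share) for link, share in shares.items()}

    # Hand out wires lost to rounding, largest fractional part first.
    leftover = total_width - sum(widths.values())
    by_remainder = sorted(shares,
                          key=lambda link: shares[link] - int(shares[link]),
                          reverse=True)
    for link in by_remainder[:leftover]:
        widths[link] += 1
    return widths
```

For example, `allocate_link_widths({"a": 3.0, "b": 1.0}, total_width=8)` gives the more heavily loaded link `a` the larger share while still spending exactly eight wires. A utilization-based variant would replace the raw loads with utilization estimates from an analytical model.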

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year and issue】 2012, Issue 08