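Research on Key Technologies for Low-Skew, Energy-Efficient On-Chip Resonant Clock Distribution Networks (低偏斜、高能效的片上谐振时钟分布网络关键技术研究)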

The Key Technologies of On-chip Resonant Clock Distribution Network with Low Skew and High Energy Efficiency

【Author】 徐毅 (Xu Yi)

【Advisor】 陈书明 (Chen Shuming)

【Author information】 National University of Defense Technology (国防科学技术大学), Electronic Science and Technology, 2013, Ph.D.

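【Abstract (translated from the Chinese)】 The clock distribution network plays a critical role in synchronous circuit systems. The quality of its design determines whether the system functions correctly, affects system performance, and accounts for a major share of total system power. This dissertation addresses two key problems in clock distribution network design for high-performance synchronous systems: timing uncertainty and low-power design. It studies on-chip resonant clocking, an emerging clock distribution technique, proposes a clock distribution mechanism based on bufferless resonant clocking, and investigates the theory and key circuits that support this mechanism. The mechanism and its design method minimize the power of local resonant clock networks and make clock skew more robust to parasitic mismatch and PVT variation. They also meet the design needs of large-scale, high-performance synchronous systems and multi-clock-domain systems, giving the whole system a clock distribution network with low power, low skew, and low jitter.

The main results and contributions of this dissertation are:

1. A power optimization strategy for local bufferless resonant clock networks. To minimize the power of a bufferless resonant clock network, a heuristic optimization algorithm is proposed that uses SPICE analysis to trade off the key design variables: the clock load, the on-chip inductors, the clock interconnect network, and the energy compensation cells. The algorithm was used to design and optimize standard benchmark circuits of different sizes. Simulation results show that the strategy converges quickly and effectively reduces the power overhead of bufferless resonant clock networks.

2. HBRCDN, a hierarchical bufferless resonant clock distribution network structure with low clock skew and high tolerance to variation. To address skew optimization and robustness in local bufferless resonant clock networks, a hierarchical distribution structure is proposed that combines the strengths of H-tree and mesh interconnects: the H-tree balances the clock interconnect paths, while the multi-fanout parallel paths of a mesh-type bufferless network are retained. Simulation results show that HBRCDN lowers the skew of the resonant clock network, avoids the effect of unbalanced clock loads on skew, and is robust to PVT variation. The structure was verified by tape-out in a TSMC 65 nm standard CMOS process.

3. A locally tightly coupled bufferless resonant clock distribution structure for the global design of resonant clock networks in large-scale synchronous systems. The synchronous system is first partitioned into multiple local clock domains with similar target frequencies. Each domain uses the HBRCDN structure to reduce clock skew, and adjacent networks are injection-locked to each other through on-chip coupling networks. The frequency and voltage characteristics of the resonant clock system are analyzed systematically using coupled-oscillator-array theory, and the structure is designed and analyzed for an open-source microprocessor core. Simulation results show that the tightly coupled structure retains the low power, low skew, and low jitter of bufferless resonant clock networks, locks easily, and provides a stable clock signal for high-performance synchronous systems.

4. A hybrid resonant clocking mechanism for multi-clock-domain systems. The global distribution network uses an array of traveling-wave oscillators to generate square-wave clock signals, while each clock domain uses a single HBRCDN or several tightly coupled HBRCDNs to obtain low-skew, low-jitter resonant signals. The global network is built from an array of PPTWOs, an improved traveling-wave oscillator, which supplies the traveling-wave clock signal; clock skew adjustment circuits and injection-locking circuits then lock the phase of the local resonant networks. Compared with an H-tree-based, injection-locked global resonant clock distribution network, the hybrid mechanism meets the global clock distribution needs of large-scale systems and also minimizes the power of the local clock distribution networks.

In summary, this dissertation systematically studies on-chip bufferless resonant clocking for high-performance digital systems. Existing bufferless resonant clocking suffers from high design complexity, strong sensitivity to parasitic mismatch and PVT variation, and limited design scale. To address these problems, the dissertation proposes a set of circuit structures and design methods based on bufferless resonant clock distribution and demonstrates their correctness and effectiveness both theoretically and in practice. Theoretical analysis and circuit simulation show that the approach effectively reduces clock network power overhead and provides the whole system with a high-frequency, low-skew, low-jitter synchronization signal. The results have theoretical value and practical engineering significance for advancing the research and application of on-chip resonant clocking in high-performance digital systems.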

【Abstract】 The clock distribution network plays an important role in a synchronous circuit system: it determines whether the system functions correctly, affects system performance, and is one of the main contributors to total system power consumption.

This dissertation focuses on two key issues of clock distribution networks for high-performance synchronous systems: timing uncertainty and low-power design. Through an in-depth study of the emerging on-chip resonant clocking technology, a clock distribution scheme based on bufferless resonant clocking is proposed, and the theory behind the mechanism and its key circuits are studied in detail. The proposed bufferless resonant scheme and the corresponding design method not only minimize the power consumption of local clock distribution but also improve the robustness of clock skew to parasitic mismatch and PVT variations. Moreover, the scheme meets the requirements of large-scale, high-performance synchronous systems and of complex systems with multiple clock domains, and provides resonant clock distribution with low power dissipation, low skew, and low jitter.

The main research achievements and innovations of this dissertation are summarized as follows:

1. A novel strategy to minimize power consumption in local bufferless resonant clock networks is proposed. To address power minimization in bufferless resonant clock distribution, a heuristic optimization algorithm is proposed that trades off the key design parameters through SPICE analysis, including the clock load, the on-chip inductor, the clock interconnect network, and the energy compensation cell. The optimization algorithm is applied to standard benchmark circuits of different sizes. Simulation results show that the power optimization strategy converges quickly and effectively reduces the power consumption of the bufferless resonant clock network.

2. A novel hierarchical bufferless resonant clock distribution network, HBRCDN, is proposed for low clock skew and high tolerance to variation. Aiming at skew optimization and robustness in bufferless resonant clock distribution networks, a hierarchical structure is presented that combines the advantages of H-tree and mesh architectures: it keeps the balanced path delay of the H-tree while employing the multi-fanout parallel paths of the mesh. Simulation results show that HBRCDN not only reduces the skew of the resonant clock network and avoids the impact of unbalanced loads, but is also robust to PVT variations. The proposed architecture is verified in a TSMC 65 nm standard CMOS process.

3. Targeting the global design problem for large-scale synchronous systems, a novel resonant clocking structure with locally close-coupled networks is presented. First, the whole synchronous system is divided into several local regions that have nearly the same target frequency and achieve low clock skew by employing the HBRCDN structure. Adjacent clock networks are injection-locked to each other through an on-chip coupling network. Based on the theory of coupled oscillator arrays, the frequency and voltage characteristics are studied systematically. Design and analysis are carried out on an open-source microprocessor core. Simulation results show that the close-coupled resonant distribution structure not only retains the low power, low skew, and low jitter of bufferless resonant clock networks, but is also easy to lock and can provide stable clock signals for high-performance synchronous systems.

4. A novel hybrid resonant clocking mechanism is proposed for multi-clock-domain systems. The structure uses a traveling-wave oscillator array for the global square-wave distribution network, and uses a single HBRCDN or multiple close-coupled HBRCDNs in each clock domain to obtain a low-skew, low-jitter resonant clock. An improved traveling-wave oscillator, PPTWO, is presented to form the array. The global traveling-wave signals are then adjusted by a clock skew compensation circuit and phase-locked to the local networks by injection-locking circuits. Compared with an H-tree-based global resonant clock distribution that relies on injection locking, the hybrid resonant clocking mechanism not only satisfies the needs of large-scale global clocking but also minimizes the power consumption of the local clock networks.

In summary, on-chip bufferless resonant clocking techniques for high-performance digital systems are systematically studied in this dissertation. To resolve problems such as high design complexity, sensitivity to parasitic mismatch and PVT variations, and limited applicability to large-scale synchronous circuits, a series of novel circuit techniques and design methodologies is proposed. The correctness and efficiency of the presented techniques and methods are verified thoroughly by theoretical derivation and simulation. The theoretical and simulation results show that the techniques and methods effectively reduce the power consumption of the clock network and provide a high-frequency, low-skew, low-jitter synchronous signal for the entire system.

The achievements presented in this dissertation have academic and practical engineering value for promoting the research and application of on-chip resonant clocking techniques in high-performance digital systems.
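
As background for the trade-off between clock load and on-chip inductor mentioned in the abstract, the sketch below applies the textbook lumped LC relation f0 = 1/(2π√(LC)). It is not the dissertation's optimization algorithm, which is not described here; the function names and the example values (100 pF load, 1 GHz target) are hypothetical and chosen only for illustration.

```python
import math


def resonant_frequency(inductance_h, capacitance_f):
    """Resonant frequency (Hz) of an ideal lumped LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def required_inductance(target_freq_hz, capacitance_f):
    """Inductance (H) that resonates with the given capacitance at the target frequency."""
    omega = 2.0 * math.pi * target_freq_hz
    return 1.0 / (omega ** 2 * capacitance_f)


# Hypothetical example values, not taken from the dissertation.
clock_load_f = 100e-12   # assumed total clock load: 100 pF
target_hz = 1e9          # assumed target clock frequency: 1 GHz

inductance_h = required_inductance(target_hz, clock_load_f)
print(f"required inductance: {inductance_h * 1e9:.4f} nH")
print(f"check frequency:     {resonant_frequency(inductance_h, clock_load_f) / 1e9:.4f} GHz")
```

In this idealized model a larger clock load calls for a smaller inductor at the same frequency. A real design would also have to account for interconnect parasitics and inductor losses, which is why the abstract describes tuning these variables through SPICE analysis.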
