
基于可重构计算的高可靠星载计算机体系结构研究

Research on the Architecture of a Highly Reliable On-board Computing System Based on Reconfigurable Computing

【Author】 Ren Xiaoxi (任小西)

【Advisor】 Li Renfa (李仁发)

【Author information】 Hunan University, Computer Application Technology, 2007, Ph.D.

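【Abstract (translated from the Chinese)】 The development of space technology is of great significance to all humanity and to every nation. It reflects a country's comprehensive national strength and its capability in advanced science and technology, and it strongly influences military affairs, national defense, and the economy. In recent years many countries have launched large-scale space research programs, including Mars exploration, manned spaceflight, and lunar landing. An on-board computing system is the application of computer technology in the space environment; it is responsible for spacecraft control and data processing. The harsh conditions of space pose great challenges to the performance, reliability, and cost of on-board computing systems. Given high research and manufacturing costs and limited hardware resources, ensuring highly reliable processing of massive data is a difficult yet critical task. Designing an on-board data processing system that is fast, reliable, and acceptable in cost is of great importance for space science exploration and for completing planned scientific missions.

Taking the Space Solar Telescope (SST) project as its background, this thesis studies an on-board computer architecture that supports massive data processing with high reliability and high performance, together with system fault detection and repair methods and their theoretical reliability analysis. Based on a study of the development and characteristics of reconfigurable technology, it points out that fast, flexible dynamic reconfiguration can satisfy the requirements of on-board computing systems for both high reliability and high performance. A modular, dynamically reconfigurable architecture based on the LEON 2 processor core is proposed to strengthen the data processing capability of the on-board computing system and to make its functions more flexible. The architecture takes the LEON 2 processor as its core and uses dynamically reconfigurable modules for massive data processing, exploiting the parallelism of reconfigurable hardware resources to improve massive data processing performance. The dynamically reconfigurable modules communicate with the LEON 2 processor through a generic bus interface, which offers more flexibility than a dedicated interface.

To enhance the reliability of the dynamically reconfigurable system, an improved triple modular redundancy (TMR) structure is proposed, together with a method for error monitoring, fault detection, and recovery based on the reconfiguration configuration data. JBits, which encapsulates and simplifies the hardware configuration process and the details of the configuration data, is used to perform the configuration data operations needed for fault detection and recovery. Markov process theory is used to model and analyze the reliability of this structure and method. The analysis shows that, supported by dynamic reconfiguration, this fault detection and repair method can significantly improve system reliability.

The implementation of the LEON 2 based modular dynamically reconfigurable system is studied and a prototype system is built. Using the fast Fourier transform (FFT) as a benchmark, the performance of the prototype is compared with that of other data processors commonly used in on-board computing systems, such as the 80386 and the ADSP 21020. The results show that, when processing large amounts of continuous data, the modular dynamically reconfigurable system outperforms the other systems. Finally, a fault repair simulation experiment based on JBits/JRoute is presented: when routing resources on the chip are faulty, JRoute's automatic routing can find an alternative path and repair the fault.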

【Abstract】 The development of aerospace technology is important to individual nations and to humanity as a whole. It is a sign of comprehensive national strength and of research and development capability in advanced science and technology, and it has a great impact on military technology, national defence, the economy, and many other related fields. Recently, many countries have launched large-scale space exploration projects, such as manned orbital flight, manned Moon landing, and missions to Mars. An on-board computing system (OBCS) is the application of computing technology in the outer space environment; it performs tasks such as spacecraft control, communication, and on-board data processing. Because of the harsh working conditions in outer space, it is a great challenge for an OBCS to achieve good speed, reliability, and cost simultaneously. High manufacturing cost and a limited budget make it demanding to implement a fast and reliable OBCS, which is vital to any space exploration mission.

The Space Solar Telescope (SST) is a scientific space research project that will send a solar telescope into outer space to overcome the negative effects of the Earth's atmosphere. With the SST project as background, this thesis addresses the key challenges in designing a reliable, high-performance OBCS, namely the system architecture and the fault diagnosis and recovery methods. By analyzing the development and characteristics of reconfigurable computing technology, the thesis finds that dynamic reconfiguration can meet the requirements that an OBCS imposes on performance, reliability, and cost. A modular architecture based on the LEON 2 IP (Intellectual Property) core is proposed; it supports dynamic reconfiguration and can boost system performance and flexibility.

In this architecture, a dynamically reconfigurable module processes the bulk data in order to exploit the inherent parallelism of reconfigurable hardware. The module communicates with the LEON 2 processor IP core via a generic co-processor interface, which provides better flexibility than a special-purpose interface. To enhance system reliability, an improved TMR (Triple Modular Redundancy) scheme and a methodology for detecting and removing faults are also provided. The fault detection and recovery methodology is based on the configuration bitstream (configuration data). To simplify its implementation, JBits is employed to handle the configuration data and the hardware reconfiguration operations. Markov process theory is used to model and evaluate the reliability of a system that can be repaired through hardware reconfiguration. The analysis results show that this approach to fault detection and recovery can improve system reliability.

To facilitate the design and implementation of a modular, dynamically reconfigurable OBCS, the basic system design process and some implementation techniques are explored. Their feasibility and effectiveness are demonstrated by building a prototype system. Using an FFT benchmark, the performance of the reconfigurable prototype is evaluated and compared with systems of different architectures, such as the 80386 and the ADSP 21020. The test results indicate that the performance of the dynamically reconfigurable system is much higher than that of the other systems when processing large amounts of data continuously. Simulation results on routing fault recovery using JBits/JRoute are also given; they show that JRoute can automatically find a new route to replace a faulty one after a routing fault occurs.
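As an illustration of the kind of Markov reliability analysis the abstract mentions, the following is a minimal sketch of the textbook three-state model of a TMR system with repair. It is not the model used in the thesis, which the abstract does not specify. The function names, the per-module failure rate `lam`, and the repair (reconfiguration) rate `mu` are assumptions introduced here. The states are "three modules good", "two modules good", and an absorbing "failed" state, with transition rates 3·lam, 2·lam, and mu; setting `mu = 0` gives plain TMR without repair.

```python
import math


def tmr_reliability(t, lam, mu=0.0):
    """Reliability R(t) of a TMR system under a three-state Markov model.

    Assumed (hypothetical) model, not taken from the thesis:
      state 0 (3 good) --3*lam--> state 1 (2 good) --2*lam--> failed
      state 1 --mu--> state 0   (repair, e.g. by reconfiguration)
    R(t) is the probability of being in state 0 or state 1 at time t.
    """
    if lam <= 0 or mu < 0 or t < 0:
        raise ValueError("require lam > 0, mu >= 0, t >= 0")
    # The generator restricted to states 0 and 1 has the characteristic
    # polynomial s^2 + (5*lam + mu)*s + 6*lam^2 with two real roots.
    b = 5.0 * lam + mu
    disc = math.sqrt(b * b - 24.0 * lam * lam)
    s2 = (-b - disc) / 2.0
    # Use the product of the roots (6*lam^2) to avoid cancellation
    # when mu is much larger than lam.
    s1 = 6.0 * lam * lam / s2
    return (s1 * math.exp(s2 * t) - s2 * math.exp(s1 * t)) / (s1 - s2)


def tmr_mttf(lam, mu=0.0):
    """Mean time to failure of the same model: the integral of R(t)."""
    if lam <= 0 or mu < 0:
        raise ValueError("require lam > 0, mu >= 0")
    return (5.0 * lam + mu) / (6.0 * lam * lam)


if __name__ == "__main__":
    lam = 1e-3  # assumed failure rate per module, per hour
    for mu in (0.0, 0.1):  # without repair, then with an assumed repair rate
        print(mu, tmr_reliability(500.0, lam, mu), tmr_mttf(lam, mu))
```

With `mu = 0` the expression reduces to the familiar 3e^(-2λt) - 2e^(-3λt) for TMR without repair. Any `mu > 0` raises both R(t) and the mean time to failure, which is the qualitative effect the abstract attributes to repair through reconfiguration.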

  • 【Online publication contributor】 Hunan University
  • 【Online publication year/issue】 2007, Issue 04
  • 【Classification codes】 TP338; V446
  • 【Times cited】 31
  • 【Downloads】 1248