
The Research and Implementation of an H.261-Based Video Conference Terminal System

【Author】 Wang Yonghui (王永会)

【Advisor】 Zhao Yongzhe (赵永哲)

【Author information】 Jilin University, Computer Software and Theory, 2004, Master's degree

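【Abstract, translated from Chinese】 As the information society develops and network and computer technologies become more widespread, demand for high-quality multimedia services such as video and audio over networks keeps growing. Video conferencing is a multimedia communication technology that lets people in different locations communicate over some transmission medium in a "real-time, visual, interactive" way. It saves users time and improves work efficiency, has a very wide range of applications, and has very good prospects. Two important trends in its development are the shift in protocol from H.320 to H.323 and the shift in encoding and decoding from hardware to software.

To gain a deeper understanding of video conferencing technology and to lay a foundation for the author's later research, and following these trends, the author analyzed and compared the protocols within H.323 in detail, selected the H.261, G.723.1, and RTP protocols for in-depth study, and on the basis of these three protocols completed a pure-software implementation of a video conference system terminal. The terminal consists mainly of four classes: the main form class, the H261 class, the G7231 class, and the RTP class. The H261 and G7231 classes compress the local terminal's audio and video data and decompress the audio and video data of other terminals. The main form class issues encode and decode requests to the H261 and G7231 classes and obtains the results. Communication between the main form class and the RTP class carries the terminal's communication with other terminals and with the server. The body of the thesis explains in detail, with reference to the source code, the difficult and key points in implementing the terminal and how they were solved.

Building on the implemented terminal, the thesis addresses the slow decoding speed of the Huffman decoding tree method, which video decoding systems commonly use to decode the input bitstream, and proposes an improvement: the two-phase decoding method. Its basic idea applies to decoding any variable-length code table. Two-phase decoding builds on the one-phase decoding method and is a compromise among construction difficulty, space, and execution efficiency.

The main idea of one-phase decoding is to design an array of structures sized according to the maximum codeword length in the variable-length code table, using the codeword as the address and the (run, level) pair that the codeword represents as the members of the structure. Each time, x bits are taken from the input bitstream, their value is used as the array address, and fetching the data at that address completes the decoding of one codeword. One restriction on one-phase decoding is that, given the difficulty of designing the structure array and the space it requires, the method suits variable-length code tables whose maximum codeword length is at most 10. The other restriction is that codeword utilization must be high: most (run, level) pairs should correspond to a unique address in the structure array, and the number of (run, level) pairs should be close to the number of structures in the array. In Table 5 (VLC Table for TCOEFF) of the H.261 standard, however, the codewords are long (14 bits) and codeword utilization is low (the structure array would need 128 times as many structures as there are (run, level) pairs in the table), so one-phase decoding is not suitable for it.

Two-phase decoding solves these problems by performing one-phase decoding twice. The workflow is as follows. First, x bits are taken from the input bitstream and their value is used as an address to fetch the corresponding structure sa from a pre-designed structure array A. The flag member of sa indicates whether the first decoding succeeded. Let r be the value of sa's remaining-bits member; r is the difference between x and the actual codeword length. If the flag is 1, the first decoding succeeded: the pointer to the head of the input bitstream is moved back r bits, and the (run, level) pair is taken out for subsequent processing. If the flag is 0, the codeword is longer than x, so r more bits are taken from the input bitstream for the second decoding, and the (run, level) pair represented by the codeword of length x+r is obtained from structure array B. Because codeword lengths in a variable-length code table are assigned according to the probability of each (run, level) pair, with more probable pairs getting shorter codewords, the first decoding succeeds with high probability; because a second decoding is sometimes needed, however, efficiency is reduced. Two-phase decoding is therefore a compromise between the difficulty of designing the structures and decoding efficiency. A comparative analysis of the two decoding methods above verified the running-time advantage of two-phase decoding: at the cost of more space, it cuts the average decoding time roughly in half and raises decoding speed. Because video decoding has strict real-time requirements, trading space for time is worthwhile.

In summary, this thesis implements in software a video conference system terminal based mainly on the H.261 protocol, and proposes a new method for decoding the video input bitstream, the two-phase decoding method, which gave good results when applied to video decoding. Owing to limited time, the terminal is still far from complete, and the author will improve it in follow-up work.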
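The two-phase lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the code table `TOY_VLC` is a made-up prefix code rather than H.261 Table 5, the names `build_tables` and `decode_two_phase` are hypothetical, the bitstream is modeled as a string of '0'/'1' characters, and array B is assumed to be split into one small array per first-stage prefix. As a further assumption, second-stage entries also store a surplus-bit count so that codewords sharing a prefix may differ in length; when they all have the same length, this reduces to the x+r case in the abstract.

```python
# Hypothetical toy prefix code: bit string -> (run, level).
# This is NOT H.261 Table 5; it only mimics "frequent pairs get short codes".
TOY_VLC = {
    "1": (0, 1),
    "01": (1, 1),
    "001": (0, 2),
    "00010": (2, 1),
    "00011": (0, 3),
    "00001": (3, 1),
    "000000": (1, 2),
    "000001": (4, 1),
}


def build_tables(vlc, x):
    """Build first-stage array A (indexed by x bits) and second-stage arrays B."""
    table_a = [None] * (1 << x)
    table_b = {}  # assumption: one second-stage array per unfinished x-bit prefix

    # For each x-bit prefix that does not complete a codeword, r is the
    # longest remaining tail among the codewords that share that prefix.
    remaining = {}
    for code in vlc:
        if len(code) > x:
            prefix = int(code[:x], 2)
            remaining[prefix] = max(remaining.get(prefix, 0), len(code) - x)
    for prefix, r in remaining.items():
        table_a[prefix] = (0, r, None)        # flag 0: second phase needed
        table_b[prefix] = [None] * (1 << r)

    for code, symbol in vlc.items():
        if len(code) <= x:
            # Short codeword: fill every x-bit address that starts with it.
            pad = x - len(code)
            base = int(code, 2) << pad
            for fill in range(1 << pad):
                table_a[base + fill] = (1, pad, symbol)  # flag 1, r = surplus bits
        else:
            prefix, tail = int(code[:x], 2), code[x:]
            pad = remaining[prefix] - len(tail)
            base = int(tail, 2) << pad
            for fill in range(1 << pad):
                table_b[prefix][base + fill] = (pad, symbol)
    return table_a, table_b


def decode_two_phase(bits, table_a, table_b, x):
    """Decode a '0'/'1' string into a list of (run, level) pairs."""
    pos, out = 0, []
    while pos < len(bits):
        # Phase 1: take x bits (zero-padded at the end of the stream).
        index = int(bits[pos:pos + x].ljust(x, "0"), 2)
        pos += x
        entry = table_a[index]
        if entry is None:
            raise ValueError("invalid codeword in first phase")
        flag, r, symbol = entry
        if flag == 1:
            pos -= r                      # give back the r surplus bits
        else:
            # Phase 2: take r more bits and look them up in B.
            index2 = int(bits[pos:pos + r].ljust(r, "0"), 2)
            pos += r
            entry2 = table_b[index][index2]
            if entry2 is None:
                raise ValueError("invalid codeword in second phase")
            surplus, symbol = entry2
            pos -= surplus
        out.append(symbol)
    return out


if __name__ == "__main__":
    a, b = build_tables(TOY_VLC, 3)
    print(decode_two_phase("1" + "00010" + "001", a, b, 3))
```

With x = 3 the first-stage array has 8 entries and the single second-stage array has 8 more, compared with 64 entries for a one-phase table covering the 6-bit maximum length of this toy code; the most frequent (shortest) codewords never reach the second lookup.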

【Abstract】 With the development of information technology and the spread of networks, demand for high-quality video and audio multimedia services is growing rapidly. Videoconferencing is a multimedia communication technology that enables real-time, interactive communication among people in different places. It saves users time and improves work efficiency, has a wide range of applications, and has good prospects. Two important trends in its evolution are the transition of the protocol from H.320 to H.323 and the change of codec from hardware to software.

To acquire in-depth knowledge of videoconferencing technology and establish a solid basis for later related research, the author follows these trends and selects the H.261, G.723.1, and RTP protocols for in-depth research and discussion, and then describes a software implementation of a videoconference terminal system based on these protocols. The terminal system is mainly composed of four classes: the main form class, the H261 class, the G7231 class, and the RTP class. The H261 and G7231 classes are responsible for video and audio encoding and decoding. The main form class sends codec requests to the H261 and G7231 classes and gets the results from them. Through communication between the main form class and the RTP class, the main form class gets information from the videoconference server and from other terminals.

On the basis of this implementation, the thesis puts forward a new method, decode-in-two-phases, to improve the speed of bitstream decoding. The basic concept of this method can be used in any decoding system that decodes a VLC table. The decode-in-two-phases method is a tradeoff between difficulty and efficiency, built on the decode-in-one-phase method.

The main concept of the decode-in-one-phase method is to design a structure array according to the maximum code length in the VLC table, using the bit value of a code as the address of the corresponding array element and the run and level represented by the code as that element's members. After taking x bits from the bitstream, the decoder uses their value as the address to get the run and level from the structure array. There are two conditions for using the decode-in-one-phase method. One is a restriction on the maximum code length: considering the difficulty of designing the structure array and the space it needs, the method is suitable for VLC tables whose maximum code length is smaller than 11. The other is the utilization rate of the codes in the VLC table, which refers to how close the number of run-and-level pairs is to the number of elements in the structure array. In Table 5 (VLC Table for TCOEFF) of the H.261 protocol, the maximum code length is 14 and the utilization rate is very low, so the decode-in-one-phase method is not suitable for it.

The decode-in-two-phases method executes decode-in-one-phase twice to overcome this limitation. It works as follows. It first takes x bits from the bitstream and uses their value as the address to get the corresponding array element (for example, sa). The value of flag (a member of sa) indicates whether the first decode succeeded. The value of r (another member of sa) represents the difference between x and the real length of the code. If flag is 1, the first decode succeeded: the pointer to the head of the bitstream moves back r bits, and the run and level are taken from sa for later handling. If flag is 0, the length of the code is x plus r, so r more bits are taken from the bitstream and decoded again. The run and level can then be obtained from another structure array. Moreover, the result of x adding r is seen a

  • 【Online publication submitter】 Jilin University
  • 【Online publication issue】 2004, Issue 04
  • 【Classification code】 TN948.63
  • 【Download count】 165