数字音视频码流的分割及合并技术研究
Research on the Digital Video/audio Splitting and Merging Technology
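【Author】 Weng Chao (翁超);
【Advisor】 Wang Xingdong (王兴东);
【Author information】 Shanghai Jiao Tong University, Signal and Information Processing, 2010, Master's thesis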
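【Abstract, translated from the Chinese】 With the development of digital audio/video compression technology and the upgrading of multimedia services of all kinds, techniques for splitting and merging audio/video streams will find increasingly wide application. This thesis studies these techniques in two application environments: material editing and cluster transcoding. In the material-editing environment, splitting and merging focuses on implementing the nonlinear-editing "cut" and "splice" operations on audio/video material that has a complete organizational structure. For P2-series material, the mainstream format in current HD nonlinear-editing production, the thesis discusses in turn the implementation of splitting and merging for high-bit-rate MXF material and low-bit-rate MP4 material. For high-bit-rate MXF files of the DV and AVCI types, which use intra-frame compression, the difficulties lie in parsing and preserving the metadata of the original material and in handling large files efficiently. The thesis describes in detail the procedure for parsing the metadata and locating the audio/video data in such files, proposes a multi-threaded rewriting scheme, and determines a suitable rewrite block size experimentally, which effectively shortens task time. For low-bit-rate MP4 files, which use inter-frame compression, the thesis proposes a splitting scheme based on frame-type transformation, covering both the low-delay mode and streams that contain bidirectionally predicted frames, and achieves frame accuracy. Compared with the traditional scheme of full decoding, splitting, and re-encoding, it has the following advantages: because frame types are transformed only for the relevant frames near the split point, with no decoding and re-encoding of the whole range, task time is effectively shortened; and the picture-quality degradation caused by the full decode/split/re-encode scheme is avoided. In the cluster environment, splitting and merging focuses on a multi-granularity splitting scheme and on algorithms and schemes for smoothly merging sub-segments, so that the cluster transcoding system can pool its computing resources effectively and complete transcoding tasks. Taking into account the characteristics of the cluster transcoding system's workflow, the thesis analyzes the shortcomings of having the transcoding management server split the audio/video physically and proposes a quasi-splitting scheme based on marking points. For the commonly used MPEG-2 transport stream format, it discusses how to demultiplex and mark the material and how to divide the tasks, and settles on a GOP-based splitting strategy. It then focuses on how to merge and multiplex the material segments so that audio and video are resynchronized. Finally, experiments on a cluster transcoding system with 7 computing nodes examine the effect of splitting granularity on transcoding performance, and a suitable splitting granularity is proposed.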
【Abstract】 With the development of digital video/audio compression and the enhancement of various multimedia services, splitting and merging technology for video material is being applied more and more widely. This dissertation focuses on the algorithms and implementations of video/audio splitting and merging from two specific points of view: material editing and cluster transcoding.

In the material editing environment, the emphasis is on how to implement "cut" and "splice", the nonlinear editing operations, on certain video/audio materials. For P2, the current mainstream format family in HD nonlinear editing, splitting and merging are discussed first for high-bit-rate MXF files and then for low-bit-rate MP4 files. For intra-frame compressed MXF files in the DV and AVCI formats, the key issues are parsing and preserving the metadata and handling large files efficiently. The process of metadata parsing and video/audio data location for this kind of material is described in detail. A multi-threaded file I/O scheme is proposed to reduce processing time; experiments are conducted on the rewriting block size, and an appropriate size is suggested. For inter-frame compressed low-bit-rate MP4 files, a scheme based on frame-type transformation is proposed that achieves frame-accurate operations in both the low-delay case and the case containing B-frames. Compared with the traditional scheme of full decoding, splitting, and re-encoding, it has the following advantages: it greatly reduces processing time by avoiding decoding and re-encoding the whole file, and it prevents the quality degradation of video frames that re-encoding introduces.

In the cluster transcoding environment, the emphasis is on splitting video/audio materials at multiple granularities and merging the split clips into a smooth, integrated whole, so that the cluster transcoding system can pool its computing resources effectively and complete tasks efficiently. Considering the characteristics of the service process of a cluster transcoding system, the dissertation first analyzes the shortcomings of the physical splitting scheme and then proposes a quasi-splitting scheme based on point recording that suits the service process of the cluster system. The demultiplexing and task splitting schemes for the MPEG-2 transport stream are discussed in detail, and a splitting strategy based on the GOP (Group of Pictures) is determined. The dissertation then discusses in detail the algorithms for merging and multiplexing video/audio clips so as to guarantee resynchronization between the video and audio data. Finally, experiments on a cluster system with 7 computing nodes investigate the impact of splitting granularity on cluster transcoding performance.
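
The quasi-splitting idea in the abstract, recording split points at GOP boundaries instead of physically cutting the file, can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the `Frame` record, `mark_gop_starts`, `plan_tasks`, and the `gops_per_task` granularity parameter are hypothetical names introduced here, no MPEG-2 transport stream parsing is done, and the sketch assumes for simplicity that every I-frame opens a new GOP.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """Hypothetical index entry produced by a demultiplexer."""
    offset: int      # byte offset of the frame in the source stream
    frame_type: str  # "I", "P" or "B"


def mark_gop_starts(frames: List[Frame]) -> List[int]:
    """Return the indices of frames that open a GOP.

    Simplifying assumption: every I-frame starts a new GOP.
    """
    return [i for i, f in enumerate(frames) if f.frame_type == "I"]


def plan_tasks(frames: List[Frame], gops_per_task: int) -> List[Tuple[int, int]]:
    """Group consecutive GOPs into tasks without copying any data.

    Each task is a (first_frame_index, end_frame_index_exclusive) pair, so a
    worker only needs the recorded points to read its share of the source.
    Frames before the first I-frame are not assigned to any task.
    """
    if gops_per_task < 1:
        raise ValueError("gops_per_task must be at least 1")
    starts = mark_gop_starts(frames)
    tasks = []
    for k in range(0, len(starts), gops_per_task):
        begin = starts[k]
        nxt = k + gops_per_task
        # The last task runs to the end of the stream.
        end = starts[nxt] if nxt < len(starts) else len(frames)
        tasks.append((begin, end))
    return tasks


if __name__ == "__main__":
    demo = [Frame(i * 100, t) for i, t in enumerate("IPBIPBIPB")]
    for begin, end in plan_tasks(demo, gops_per_task=2):
        print(begin, end, demo[begin].offset)
```

Changing `gops_per_task` changes the splitting granularity, the quantity whose effect on transcoding performance the abstract says was studied experimentally on the 7-node cluster.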