

Layer Transmission of Video Stream and Recognition of Text Pattern in Stream

【Author】 丛键 (Cong Jian)

【Advisor】 李在铭 (Li Zaiming)

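【Author Information】 University of Electronic Science and Technology of China (电子科技大学), Communication and Information Systems, 2001, Ph.D.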

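【Abstract, translated from the Chinese】 As network technology and resources have developed, network-based video applications have become increasingly common and place higher demands on network video transmission technology. Improving the transmission quality of real-time video services over various types of networks has therefore become one of the active areas of current research. At the same time, as the volume of information carried by video grows rapidly, accurate and efficient content-based indexing of video has become a pressing problem, and describing video content by means of the text that appears in the video stream is currently a promising solution.

The research in this thesis has two parts. First, an algorithm for reconstructing lost data in video and images is proposed; building on it, schemes for real-time network video transmission are studied, and a layered coding scheme at the source, a data stream organization scheme, and a post-processing scheme at the receiver are proposed. Second, the detection and recognition of text in video streams is studied, and corresponding theoretical models and implementation methods are proposed; these gave good results when integrated into a practical application system.

For real-time network video transmission, the following work was carried out. At the receiver, taking transform coding as the object of study, the problem of reconstructing information in damaged image sub-blocks is analyzed in depth, and a model is established for reconstructing lost information from the boundary signals of image sub-blocks. The space of sub-block boundary components of the transform basis signals is also analyzed; a method is proposed for constructing an orthonormal basis of this space, together with a fast algorithm, based on that basis, for reconstructing sub-block transform coefficients from sub-block boundary signals. The successful application of this technique to removing the blocking artifacts of transform coding is also described. Building on receiver-side reconstruction of lost information, the transmission characteristics of ATM networks and packet-switched networks are analyzed; source-side layered coding schemes and video stream organization schemes are designed for VBR services in ATM networks and for packet-switched networks respectively, and corresponding reconstruction schemes are proposed at the receiver for lost packets and cells.

The detection and recognition of text in video streams comprises three tasks: detecting and locating text regions in the video stream; detecting and extracting character objects within the text regions; and recognizing the character objects. This thesis contributes to all three. First, by analyzing the signal characteristics of text in general video images, a detection model for text regions in video streams is established, and a text region detection technique is proposed that uses multi-scale blurring of the image and the multi-resolution analysis of wavelet theory, combining global region features with texture features. For character object detection, a model is established for detecting, within a general set of objects, the subset whose features follow a given regularity; a detection method exploiting the regular spatial distribution of character objects within a text region is proposed, and for its implementation the concept of a distance generation matrix and a fast implementation based on it are introduced. Character recognition consists of two steps, extracting recognition features and recognition. For feature extraction, the concept of a rough character skeleton and a corresponding skeleton feature extraction technique that does not rely on thinning are proposed; by decomposing a character into components, extracting local skeletons, and connecting them into a global skeleton, this technique describes the geometric shape of the character by its skeleton at a given scale. From the skeleton features, using the theory and methods of graph theory, a technique is proposed for extracting recognition features that include stroke features and stroke structure features. In recognition, the stroke features are used together with fuzzy recognition theory to obtain a fast character recognition technique with good robustness to interference.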
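
The idea of rebuilding a lost sub-block's transform coefficients from its boundary signal can be illustrated with a minimal sketch. It is not the thesis's algorithm: the thesis constructs an orthonormal basis of the boundary component space, whereas this sketch simply fits a few low-frequency DCT coefficients by least squares so that the block's border matches the received pixels just outside it. The 8x8 block size, the choice of low-frequency terms (`max_order`), and all function names are assumptions made for illustration, and the lost block is assumed not to touch the image edge.

```python
import numpy as np

N = 8  # assumed block size, typical of DCT-based codecs


def dct_basis(u, v, n=N):
    """Unnormalised 2-D DCT-II basis image for the frequency pair (u, v)."""
    idx = np.arange(n)
    cu = np.cos((2 * idx + 1) * u * np.pi / (2 * n))
    cv = np.cos((2 * idx + 1) * v * np.pi / (2 * n))
    return np.outer(cu, cv)


def neighbor_targets(img, r0, c0, n=N):
    """Target value for each border pixel of the lost block: the mean of the
    received pixels lying just outside the block next to it."""
    total = np.zeros((n, n))
    count = np.zeros((n, n))
    total[0, :] += img[r0 - 1, c0:c0 + n]
    count[0, :] += 1
    total[-1, :] += img[r0 + n, c0:c0 + n]
    count[-1, :] += 1
    total[:, 0] += img[r0:r0 + n, c0 - 1]
    count[:, 0] += 1
    total[:, -1] += img[r0:r0 + n, c0 + n]
    count[:, -1] += 1
    mask = count > 0  # True exactly on the block border
    return total[mask] / count[mask], mask


def reconstruct_block(img, r0, c0, n=N, max_order=2):
    """Estimate the lost n x n block whose top-left corner is (r0, c0).
    Pixels inside the lost block are never read."""
    freqs = [(u, v) for u in range(n) for v in range(n) if u + v <= max_order]
    target, mask = neighbor_targets(img, r0, c0, n)
    # Each column holds one basis image restricted to the block border.
    A = np.stack([dct_basis(u, v, n)[mask] for u, v in freqs], axis=1)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return sum(c * dct_basis(u, v, n) for c, (u, v) in zip(coef, freqs))


rows, cols = np.mgrid[0:24, 0:24]
ramp = 0.5 * rows + 0.25 * cols + 10.0
estimate = reconstruct_block(ramp, 8, 8)
print(np.abs(estimate - ramp[8:16, 8:16]).max())
```

Fitting only low-frequency terms keeps the system well overdetermined (28 border pixels against 6 unknowns here), which suits smooth image regions; blocks containing edges or texture would need more terms or a different model.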

【Abstract】 As network technology and resources have improved, network-based video applications have become more popular, and the need for new network video transmission technology is pressing. Improving transmission quality over networks has become an active field of research. At the same time, as the amount of video-based information grows rapidly, indexing video content accurately and efficiently is an attractive research area, and indexing content by means of the text in video streams is a promising method.

This thesis covers two lines of research. First, an algorithm for reconstructing lost data in video and images is proposed. Building on this reconstruction technique, and on an analysis of real-time video transmission schemes over networks, we developed a layered coding scheme at the sender, a data stream structure, and a post-processing scheme at the receiver. The successful application of this technique to reducing blocking artifacts in transform coding is also described. Second, we studied the detection and recognition of text in video streams and propose a corresponding implementation method, which performed well in a practical application.

In real-time video transmission, we made the following progress. At the receiver, we analyzed the reconstruction of information lost in image sub-blocks under transform coding, and propose a lost-information reconstruction technique, with a corresponding fast algorithm, based on the boundary information of the image sub-block. We also applied this method successfully to reducing blocking artifacts in transform coding. Building on this reconstruction technique, we analyzed the transmission characteristics of ATM and packet-switched networks and propose a sender-side layered coding scheme and a video stream format for each. At the receiver, a method for reconstructing lost information is proposed to handle packet or cell loss during transmission.

The work on detecting and recognizing text in video streams comprises three tasks: (1) detecting and locating text regions in the video stream; (2) segmenting and detecting the target characters in the text region; (3) recognizing the text. This thesis contributes to all three. By studying the signal features of text in general video and images, we developed a detection model for text regions and a method of realizing it that uses the global and local texture characteristics of the text region, based on wavelet multi-resolution analysis and multi-scale image blurring. For character object detection, we developed a model for detecting general objects that follow some regularity, and propose a technique that detects character objects in the image from the regularity of their spatial distribution. A fast algorithm for character object detection based on the distance generation matrix is introduced. For character recognition, we propose the concept of a "rough skeleton" and developed a skeleton extraction technique that does not rely on thinning and proceeds in three steps: decomposing the character into components, extracting local skeletons, and connecting them into a global skeleton. These skeletons are then used, with the theory and methods of graph theory, to form a description in terms of stroke features and stroke structure features, and a fuzzy recognition technique is introduced to achieve highly robust character recognition.
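
Detecting characters from the regularity of their spatial distribution can be illustrated with a small sketch. The abstract does not define the distance generation matrix, so the code below substitutes a plain pairwise distance matrix and should be read as an assumption throughout: candidate objects are bounding boxes `(x, y, w, h)`, two boxes are compatible when their centres are vertically aligned and their heights are similar, and compatible boxes within a bounded horizontal distance are chained into one text line. The thresholds `align_tol`, `size_tol`, `max_gap` and `min_group` are hypothetical parameters, not values from the thesis.

```python
import numpy as np


def pairwise_distance_matrix(boxes, align_tol=0.5, size_tol=0.5):
    """Horizontal centre-to-centre distances between candidate boxes.
    Pairs that are not vertically aligned or differ too much in height
    are marked with inf. Also returns the pairwise maximum height."""
    boxes = np.asarray(boxes, dtype=float)
    cx = boxes[:, 0] + boxes[:, 2] / 2
    cy = boxes[:, 1] + boxes[:, 3] / 2
    h = boxes[:, 3]
    hmax = np.maximum(h[:, None], h[None, :])
    hmin = np.minimum(h[:, None], h[None, :])
    aligned = np.abs(cy[:, None] - cy[None, :]) <= align_tol * hmin
    similar = (hmax - hmin) <= size_tol * hmax
    dist = np.abs(cx[:, None] - cx[None, :])
    dist[~(aligned & similar)] = np.inf
    np.fill_diagonal(dist, np.inf)
    return dist, hmax


def group_characters(boxes, max_gap=2.0, min_group=3):
    """Chain compatible boxes into groups and keep groups that are large
    enough to look like a line of characters."""
    dist, hmax = pairwise_distance_matrix(boxes)
    linked = dist <= max_gap * hmax
    n = len(dist)
    visited = np.zeros(n, dtype=bool)
    groups = []
    for seed in range(n):
        if visited[seed]:
            continue
        visited[seed] = True
        stack, members = [seed], []
        while stack:  # depth-first walk over linked boxes
            i = stack.pop()
            members.append(i)
            for j in np.flatnonzero(linked[i]):
                if not visited[j]:
                    visited[j] = True
                    stack.append(int(j))
        groups.append(sorted(members))
    return [g for g in groups if len(g) >= min_group]


# Five evenly spaced 20x20 boxes on one line, plus two unrelated objects.
candidates = [(10, 50, 20, 20), (32, 50, 20, 20), (54, 50, 20, 20),
              (76, 50, 20, 20), (98, 50, 20, 20),
              (200, 5, 60, 60), (15, 150, 4, 4)]
print(group_characters(candidates))
```

Building the whole matrix once and thresholding it keeps the grouping step to simple array operations, at the cost of quadratic memory in the number of candidate objects.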
