多特征融合视频复制检测关键技术研究

Research on Key Technologies of Multi-feature Video Copy Detection

【Author】 Chen Xiuxin (陈秀新)

【Advisor】 Jia Kebin (贾克斌)

【Author information】 Beijing University of Technology, Circuits and Systems, 2013, PhD

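【Abstract (translated from the Chinese)】 With the widespread use of digital video capture devices and the rapid development of computer network technology, the volume of video data on the Internet is growing explosively. Video copy detection can quickly and efficiently find videos with the same content among large amounts of video data, so it is in great demand and of significant value in digital video copyright protection, video management and indexing, media tracking, and related fields. In recent years, content-based video copy detection has become a research hotspot in multimedia information processing. Existing video copy detection techniques suffer from heavy computation, low recall and precision, poor robustness, and a limited range of application, so fast and efficient video copy detection methods are urgently needed. Against this background, this dissertation studies the key technologies of video copy detection in depth. The main work and contributions are as follows:

1. A video copy detection method based on temporal color feature curves (Video Copy Detection Based on Spatial-temporal Color Feature Curves, SCFC-VCD) is proposed. To address the heavy computation common in video copy detection, a detection method based on temporal feature curves is proposed. First, each video frame is partitioned, the means of the Y and U color components of each sub-region are extracted, and these values are arranged in frame order to form the video's feature curves. Then the extracted feature curves are matched against those of the video to be matched. To remove the effect of overall drift in luminance and chrominance, a similarity matching algorithm based on difference curves is proposed. To remove the effect of abrupt interference, an exception factor (Exception Factor) is introduced. To match videos at different time scales, an improved dynamic time warping matching algorithm is proposed. Experimental results show that SCFC-VCD requires very little computation and retrieves faster than typical methods, detects well on videos whose pictures change frequently, such as advertisements, and withstands common interference. For videos whose pictures change slowly, such as TV series, SCFC-VCD can quickly and effectively filter out most irrelevant videos, greatly reducing the computation of the subsequent keyframe-feature-based processing.

2. A video copy detection method based on a three-dimensional quantized color histogram (Three-dimensional Quantized Color Histogram Method, TQCH) is proposed. To address the large error of color histograms at quantization boundaries and their excessive sensitivity to color changes, the three-dimensional quantized color histogram method is proposed. First, the keyframe color values in HSV color space are quantized non-uniformly. Then the color histogram is computed. To reduce the error of color values at quantization boundaries, each pair of adjacent values in the color histogram is summed along the H component direction, yielding the three-dimensional quantized color histogram, which represents the color features of the keyframe. Finally, a corresponding matching method is proposed. Experimental results show that TQCH represents keyframe color features effectively. For common color images, its recall and precision are higher than those of other existing color-feature retrieval methods, and it is robust to common interference.

3. A video copy detection method based on affine invariant connected regions (Connected Component Based Affine Invariant Region Method, CCB-Affine) is proposed. To address the small number of features, poor repeatability, and poor robustness of existing shape feature extraction methods, a new method for extracting and describing affine invariant regions is proposed. In the detector, the keyframe image is first preprocessed. Then the connected regions formed by points of equal gray value are found, adjacent connected regions whose gray-value difference is below a threshold are merged, and the last merging result that satisfies the conditions is taken as an affine invariant region. Finally, regions in the detection result that do not satisfy the conditions are removed by a filtering strategy, giving the final affine invariant regions. In the region descriptor, six invariant moments are constructed from normalized complex central moments. Experimental results show that CCB-Affine extracts shape features from images effectively and withstands a range of common interference, including viewpoint changes. Compared with other methods it is more robust, and it extracts a sufficient number of features.

4. A video copy detection method based on steerable pyramid binary image projection (Steerable Pyramid Binary Image Projection Method, SP-BIP) is proposed. To extract multi-scale, multi-orientation texture features from keyframe images, the steerable pyramid binary image projection method is proposed. First, the grayscale keyframe image is orientation-normalized and decomposed with a steerable pyramid, and each subband image is binarized with an adaptive threshold. Then the normalized row and column projection vectors of each subband image are computed as its texture features. Feature matching uses vector intersection. Experimental results show that SP-BIP effectively extracts multi-scale, multi-orientation texture features from keyframes, outperforms texture feature extraction methods such as the wavelet transform, and is robust to some common interference.

5. A multi-feature fusion video copy detection method based on Tri-training (Tri-training Based Multi-feature Video Copy Detection, TBM-VCD) is proposed. To fuse multiple visual features of video effectively, a new multi-feature fusion scheme is proposed. The color, shape, and texture features of video are fused through the Tri-training semi-supervised learning method, which compensates for the shortcomings of any single feature in practice. Co-training three classifiers improves the recall and precision of video copy detection and widens its range of application. Experimental results show that the proposed video copy detection method is fast, achieves high recall and precision, and applies to a wide range of videos. Compared with video copy detection methods that use a single feature, TBM-VCD has a clear advantage in recall and precision and meets the needs of video copy detection well.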
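The difference-curve matching described in item 1 can be illustrated with a short Python sketch. It runs classic dynamic time warping (DTW) on the frame-to-frame differences of two feature curves, so that a constant brightness offset cancels out. This is a minimal sketch under stated assumptions, not the dissertation's algorithm: the improved DTW variant and the exception factor are not reproduced, and the function names and sample values are hypothetical.

```python
from typing import List, Sequence


def difference_curve(curve: Sequence[float]) -> List[float]:
    """Frame-to-frame differences; a constant offset on every sample cancels out."""
    return [b - a for a, b in zip(curve, curve[1:])]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW cost between two 1-D sequences (not the thesis's improved variant)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def curve_distance(query: Sequence[float], reference: Sequence[float]) -> float:
    """Compare two feature curves through their difference curves."""
    return dtw_distance(difference_curve(query), difference_curve(reference))


# Hypothetical per-frame mean-Y values for one sub-region.
reference = [10.0, 12.0, 15.0, 15.0, 11.0, 9.0]
brighter_copy = [y + 20.0 for y in reference]  # overall luminance shift
unrelated = [10.0, 10.0, 30.0, 5.0, 25.0, 10.0]

print(curve_distance(brighter_copy, reference))  # 0.0: the shift cancels
print(curve_distance(unrelated, reference))      # a positive cost
print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))  # 0.0: a repeated sample is absorbed
```

The last call shows why DTW suits sequences of different lengths: a repeated sample can be aligned to the same reference sample at no extra cost.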

【Abstract】 With the wide application of video capture devices and the rapid development of Internet technology, video data on the Internet is growing explosively. Video copy detection can find videos with the same content among a large number of videos, and it has great application demand and broad prospects in digital video copyright protection, video management and indexing, and media tracking. Content-based video copy detection has therefore become a research hotspot in multimedia information processing.

Existing video copy detection methods suffer from heavy computation, low recall and precision, low robustness, and limited application domains. Research on highly efficient video copy detection methods is urgent. Against this background, this dissertation studies efficient video copy detection technology in the following aspects:

1. A method of Video Copy Detection Based on Spatial-temporal Color Feature Curves (SCFC-VCD) is proposed to deal with the heavy computation problem. First, each frame is segmented and the averages of the Y and U color components are computed. The corresponding average Y and U values are combined in frame play order to form the video's color feature curves. Then the extracted color feature curves are matched with those of the target video. In matching the feature curves, a similarity matching algorithm based on difference curves is introduced to remove the impact of an overall shift in luminance and chrominance. An exception factor is also adopted to remove the impact of abrupt interference. To match videos with different time scales, a method based on improved Dynamic Time Warping is proposed. The experimental results show that SCFC-VCD requires little computation and is faster than other methods. For videos whose content changes frequently, such as advertisements, the proposed method detects copies effectively. It is also robust to common disturbances. For videos whose content hardly changes, such as TV series, the proposed method can quickly filter out most unrelated videos, which reduces the computation in the subsequent keyframe-based processing.

2. A Three-dimensional Quantized Color Histogram (TQCH) method is proposed. Color histograms are sensitive to color changes at quantization boundaries, and TQCH is proposed to deal with this problem. First, the HSV color values of keyframes are quantized non-uniformly. Then color histograms are calculated. To decrease the quantization error at boundaries, neighboring values along the H dimension of the histogram are added. The resulting histogram is defined as the Three-dimensional Quantized Color Histogram and is used to represent the color feature of the keyframe. Finally, a corresponding matching method is proposed. Experimental results show that TQCH represents the color features of the keyframe effectively. For common color images, its recall and precision are higher than those of other color-based methods. It is also robust to common disturbances.

3. A Connected Component Based Affine Invariant Region (CCB-Affine) method is proposed. Existing shape feature extraction methods yield few features and have low repeatability and robustness. A new affine invariant feature detector and descriptor is proposed. In the detector, keyframes are preprocessed, and pixels with the same grayscale value are connected to form connected regions. Regions whose gray-value difference is smaller than the threshold are merged. The last merging result gives the affine invariant regions. Finally, regions that do not satisfy the conditions are removed. In the descriptor, 6 invariant moments are constructed based on complex central moments. Experimental results show that the proposed method detects keyframe shapes effectively and is robust to common disturbances, including changes in viewpoint. It is more robust than other methods and detects enough shape features.

4. A Steerable Pyramid Binary Image Projection (SP-BIP) method is proposed to obtain multi-scale and multi-orientation features of the keyframe. First, orientation normalization is applied to the grayscale keyframe. Steerable pyramid decomposition is then applied to the keyframe. The resulting sub-images are binarized according to their own thresholds. Then normalized row and column projections are computed to represent the texture features. Vector intersection is used to match two keyframes. Experimental results show that the proposed method can extract multi-scale and multi-orientation texture features of the keyframe. It is superior to the wavelet-transform-based method. It is also robust to common disturbances.

5. A Tri-training Based Multi-feature Video Copy Detection (TBM-VCD) method is proposed. A new fusion method is proposed to fuse different visual features of video. The color, shape, and texture features of videos are effectively fused, which compensates for the disadvantages of any single kind of feature. Through co-training of 3 classifiers, the recall and precision of video copy detection are improved. Experimental results show that the TBM-VCD method is fast, achieves high recall and precision, and can be used for different kinds of videos. Compared with state-of-the-art methods, it achieves high recall and precision and fulfills the needs of video copy detection.
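The boundary problem that TQCH targets in item 2 can be made concrete with a small Python sketch. It quantizes HSV colors non-uniformly, builds a histogram, and then combines each hue bin with its neighbor. Everything specific here is an assumption for illustration only: the bin edges (an 8 x 3 x 3 scheme), the circular pairing of hue bins, the halving that keeps the histogram normalized, and the use of histogram intersection as the matching score are not taken from the dissertation.

```python
import colorsys
from bisect import bisect_right

# Assumed non-uniform bin edges (hypothetical, not from the thesis).
H_EDGES = [20, 40, 75, 155, 190, 270, 295, 316]  # degrees; hue >= 316 wraps to bin 0
S_EDGES = [0.2, 0.7]
V_EDGES = [0.2, 0.7]
NH, NS, NV = 8, 3, 3


def quantize(r, g, b):
    """Map an 8-bit RGB pixel to (hue bin, saturation bin, value bin)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hq = bisect_right(H_EDGES, h * 360.0) % NH
    return hq, bisect_right(S_EDGES, s), bisect_right(V_EDGES, v)


def histogram(pixels):
    """Normalized 72-bin histogram of quantized HSV values."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    hist = [0.0] * (NH * NS * NV)
    for r, g, b in pixels:
        hq, sq, vq = quantize(r, g, b)
        hist[(hq * NS + sq) * NV + vq] += 1.0
    return [c / len(pixels) for c in hist]


def merge_adjacent_hue(hist):
    """Add each hue bin to its next hue neighbor (circular), halved to stay normalized."""
    block = NS * NV
    out = [0.0] * len(hist)
    for hq in range(NH):
        nxt = (hq + 1) % NH
        for k in range(block):
            out[hq * block + k] = 0.5 * (hist[hq * block + k] + hist[nxt * block + k])
    return out


def intersection(a, b):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


# Two hypothetical flat-colored frames whose hues sit just either side of the 40-degree edge.
frame_a = [(255, 165, 0)] * 4  # hue about 38.8 degrees
frame_b = [(255, 175, 0)] * 4  # hue about 41.2 degrees

plain_a, plain_b = histogram(frame_a), histogram(frame_b)
merged_a, merged_b = merge_adjacent_hue(plain_a), merge_adjacent_hue(plain_b)

print(intersection(plain_a, plain_b))    # 0.0: the colors fall in different bins
print(intersection(merged_a, merged_b))  # 0.5: the merged bins overlap
```

With plain bins, two nearly identical oranges score as completely different. After neighboring hue bins are combined they share mass, which is the sensitivity the abstract describes reducing.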
