
Perceptual Image Hashing: Framework, Methods, and Performance Evaluation (感知图像Hash框架、方法及性能测评指标)

【Author】 Tang Zhenjun (唐振军)

【Advisor】 Wang Shuozhong (王朔中)

【Author information】 Shanghai University, Signal and Information Processing, 2010, Ph.D.

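【Abstract (translated from the Chinese)】 An image hash, also called an image digest or image identification code, is a frontier research topic in digital media content security and multimedia applications. It can be applied to image authentication, tamper detection, image copy detection, image indexing, image retrieval, digital watermarking, and related tasks. An image hash represents an image by a short numeric sequence and is a compact representation based on the image's visual content. In general, an image hash should satisfy perceptual robustness, uniqueness, and security. In other words, visually similar images should, with high probability, have identical or very close hashes regardless of whether their underlying data are identical, whereas the collision probability between the hashes of different images should be close to 0. If the image content is maliciously tampered with, the hash should change significantly, and an attacker who does not know the key should be unable to guess or forge the hash.

This dissertation studies image hashing frameworks, methods, and performance evaluation metrics. It first surveys existing evaluation metrics and image hashing methods, and then proposes an objective metric of visual similarity, two new image hashing methods, and a new image hashing framework with an implementation, which together support the evaluation and extraction of image hashes. Specifically, the dissertation investigates the following three topics.

1. An objective visual similarity measure for image hashing. Evaluating the performance of an image hash requires judging whether two images are visually similar. To meet this need, an objective measure of visual similarity is proposed. The measure extracts structural information from image blocks as features. It reflects the distortion caused by normal processing and is also sensitive to local content tampering. In detecting local content tampering and in resisting rotation, it outperforms the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index.

2. Two image hashing methods for tamper detection. Tamper detection is the most important and also the most challenging task in image hashing research. Two hashing methods are designed, one from the perspective of the human visual system and one from the perspective of data dimensionality reduction.

Perceptual image hashing from block structural features: studies of the human visual system show that structural features capture the essence of an image's visual content, so block structural features are used to construct the image hash. As an integral part of the hashing system, a new hash similarity measure is defined that effectively reveals tampering operations hidden in the image hash.

Image hashing based on non-negative matrix factorization (NMF): NMF is an effective dimensionality reduction method. The study finds that the relative magnitudes of adjacent elements in the NMF coefficient matrix remain unchanged under normal processing, whereas malicious tampering destroys this relationship. Based on this property, an efficient coefficient quantization rule is designed. During hash extraction, a secondary image is constructed to reduce the number of feature vectors, which achieves a preliminary dimensionality reduction. NMF is then applied to the secondary image, and the coefficient matrix is binarized with the quantization rule to generate the image hash.

Theoretical analysis and experimental results show that both methods are robust to JPEG compression, moderate noise, watermark embedding, Gaussian low-pass filtering, brightness and contrast adjustment, and gamma correction. In collision probability and tamper detection, they outperform the Fridrich method, the RASH method, and the NMF-NMF-SQ method.

3. A dictionary-structured image hashing framework and its implementation. The dictionary-based hashing framework has two parts: (1) construction and maintenance of the dictionary; (2) image hash extraction. The dictionary consists of several sub-dictionaries, each of which contains a large number of image-block feature vectors, called words. As training images are added, the number of words in the dictionary grows. The main role of the dictionary is to supply the words that best represent the image blocks, from which the image hash is built. To extract a hash, the image is first divided into blocks and a one-to-one mapping between image blocks and sub-dictionaries is established. The best word for each block is then looked up in the corresponding sub-dictionary and used to represent the block. Concatenating the words for all blocks yields an intermediate hash, and compressing and encoding the intermediate hash produces the final image hash. Under this framework, a dictionary-based hashing scheme is implemented with the DCT and NMF. Experimental results show that the scheme has good perceptual robustness and uniqueness. Because the dictionary is built from a large number of different image blocks, an attacker cannot forge an identical dictionary, so hash security is guaranteed at the framework level. As expected, a larger dictionary allows more words to be selected for similarity matching, which improves overall hash performance. However, using too many words increases the computational cost of the hash linearly without improving performance linearly.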
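The NMF-based method summarized in the abstract binarizes the coefficient matrix according to the relative magnitudes of adjacent entries. The following is a minimal Python sketch of that idea, not the dissertation's implementation. The abstract does not give the exact quantization rule, the factorization rank, or how the secondary image is constructed, so the sketch makes these assumptions: NMF is computed with standard multiplicative updates, it is applied directly to a small non-negative matrix (the secondary-image step is omitted), and a bit is 1 when an entry is at least as large as its right-hand neighbor. The function names `nmf`, `order_bits`, and `hamming` are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def transpose(a):
    """Return the transpose of a list-of-rows matrix."""
    return [list(r) for r in zip(*a)]


def nmf(v, rank, iters=100, seed=0, eps=1e-9):
    """Factor a non-negative matrix v into w @ h with multiplicative updates.

    Assumption: the standard multiplicative update rules are used here;
    the abstract does not say which NMF algorithm the dissertation uses.
    """
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h


def order_bits(h):
    """Assumed quantization rule: 1 if an entry >= its right neighbor, else 0."""
    bits = []
    for row in h:
        bits.extend(1 if a >= b else 0 for a, b in zip(row, row[1:]))
    return bits


def hamming(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))
```

In use, one would compute `order_bits(nmf(image, rank)[1])` for two images and compare the results with `hamming`; a small distance would suggest perceptually similar content under these assumptions. The pure-Python matrix routines keep the sketch dependency-free and are only suitable for very small inputs.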

【Abstract】 Image hash, also called image digest or image authentication code, is an emerging technology in digital media security and multimedia applications. It can be applied to image authentication, tamper detection, image copy detection, image indexing, content-based image retrieval (CBIR), digital watermarking, and so on. An image hash is a content-based compact representation that uses a short binary string to denote the image. In general, an image hash should satisfy the requirements of perceptual robustness, uniqueness, and security. Perceptual robustness means that visually identical images should have almost the same hash with high probability even if their digital representations are not identical. Uniqueness, also called anti-collision capability, means that the probability of two different images having identical or very close hash values should tend to zero. Security means that the image hash should be sensitive to malicious tampering and cannot be predicted without knowledge of the keys.

This dissertation focuses on the framework, methods, and performance evaluation of perceptual image hashing. After investigating the existing similarity metrics and hashing methods, I propose a perceptual similarity metric, two image hashing methods, and a novel hashing framework with an implementation based on the discrete cosine transform (DCT) and non-negative matrix factorization (NMF). The contributions of this dissertation are as follows:

1. I develop a perceptual similarity metric for application to robust image hashing. To measure the perceptual similarity between an original image and its modified version, I propose an objective metric. This metric, constructed from block structures, not only indicates the distortion introduced by normal processing but also identifies local tampering. It performs better than PSNR and the mean SSIM index in sensitivity to malicious tampering and in resistance to rotation.

2. I propose two image hashing methods for tamper detection. Tamper detection is an important and challenging task in image hashing. To this end, I design two methods, one based on the human visual system and one based on a data reduction technique.

Structural feature-based image hashing: since structural features can represent the visual appearance of an image, I exploit the structural features of blocks to construct robust image hashes. As an integrated part of the hashing algorithm, I define a new similarity metric that fully exploits both the perceptual robustness and the sensitivity to tampering intrinsic in the obtained image hash.

NMF-based image hashing: NMF is an effective data reduction technique. I find that the relationship between most pairs of adjacent entries in the NMF coefficient matrix is essentially invariant to ordinary image processing but changes when tampering occurs. Based on this observation, a coarse quantization scheme is devised to compress the features contained in the coefficient matrix. In hash generation, a secondary image is constructed to achieve an initial data reduction by using fewer vectors to represent the original image. NMF is then applied to the secondary image, and the quantization rule is used to binarize the coefficient matrix and form the final hash.

Theoretical analysis and experimental results show that both methods are robust against perceptually acceptable modifications to the image such as JPEG compression, moderate noise contamination, watermark embedding, Gaussian filtering, brightness and contrast adjustment, gamma correction, and scaling. They perform better than Fridrich's method, the RASH method, and the NMF-NMF-SQ scheme in both anti-collision capability and tamper detection, indicating the usefulness of the techniques in digital forensics.

3. I design a lexicographical framework for image hashing and give an implementation based on DCT and NMF. A lexicographically structured framework for generating image hashes is proposed. The system consists of two parts: dictionary construction and maintenance, and hash generation. The dictionary is a large collection of feature vectors, called words, that represent the characteristics of various image blocks. It is composed of a number of sub-dictionaries, and each sub-dictionary contains many features, the number of which grows as the number of training images increases. The dictionary provides the basic building blocks, namely the words, that form the hash. In hash generation, blocks of the input image are represented by features drawn from the associated sub-dictionaries. This is achieved by using a similarity metric to find the most similar feature among the selected features of each sub-dictionary. The corresponding features are combined to produce an intermediate hash. The final hash is obtained by encoding the intermediate hash.

Under the above framework, I implemented a hashing scheme using DCT and NMF. Experimental results show that the proposed scheme is resistant to normal content-preserving manipulations and has a very low collision probability. Since the dictionary is constructed from a very large quantity of source images, it is virtually impossible to duplicate, which makes the image hashes secure. As expected, a larger dictionary, and taking more words from the sub-dictionaries for feature matching, can lead to better performance. However, using too many words in the sub-dictionaries does not improve performance linearly; it only increases the computational burden linearly.
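The dictionary-based hash generation described in the abstract (split the image into blocks, look up the most similar word in each block's sub-dictionary, and combine the results into an intermediate hash) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the dissertation's scheme: the block feature is a hypothetical stand-in (quadrant means) rather than the DCT and NMF features, each block position is assumed to map to its own sub-dictionary, Euclidean distance is assumed as the similarity metric, each block is represented by the index of its best word, and the final encoding step is omitted. All function names are hypothetical.

```python
import math


def split_blocks(image, size):
    """Split a 2-D list into non-overlapping size x size blocks, row-major."""
    rows, cols = len(image), len(image[0])
    return [[row[c:c + size] for row in image[r:r + size]]
            for r in range(0, rows - size + 1, size)
            for c in range(0, cols - size + 1, size)]


def block_feature(block):
    """Hypothetical stand-in feature: mean intensity of the four quadrants."""
    half = len(block) // 2

    def mean(r0, c0):
        vals = [block[r][c]
                for r in range(r0, r0 + half)
                for c in range(c0, c0 + half)]
        return sum(vals) / len(vals)

    return [mean(0, 0), mean(0, half), mean(half, 0), mean(half, half)]


def build_dictionary(training_images, size):
    """Build one sub-dictionary per block position (an assumed mapping).

    Every training image contributes one word to each sub-dictionary, so
    the dictionary grows as training images are added.
    """
    dictionary = []
    for image in training_images:
        for k, block in enumerate(split_blocks(image, size)):
            if k == len(dictionary):
                dictionary.append([])
            dictionary[k].append(block_feature(block))
    return dictionary


def nearest_word(feature, sub_dictionary):
    """Index of the word closest to the feature (Euclidean distance assumed)."""
    return min(range(len(sub_dictionary)),
               key=lambda i: math.dist(feature, sub_dictionary[i]))


def intermediate_hash(image, dictionary, size):
    """Represent each block by the index of its best-matching word."""
    return [nearest_word(block_feature(block), dictionary[k])
            for k, block in enumerate(split_blocks(image, size))]
```

For example, after building a dictionary from uniform 4 x 4 training images with intensities 0, 100, and 200, an image with uniform intensity 105 maps every block to word index 1, the word contributed by the intensity-100 training image. Because the lookup cost grows with the number of words per sub-dictionary, the sketch also reflects the linear increase in computation noted in the abstract.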

  • 【Online publication contributor】 Shanghai University
  • 【Online publication year and issue】 2011, Issue 03