
数字图像篡改鉴定的数学特征研究

Research in Mathematical Features of Digital Image Forensics

【Author】 Wu Xiaomei (吴小媚)

【Advisor】 Li Yezhou (李叶舟)

【Author information】 Beijing University of Posts and Telecommunications, Applied Mathematics, 2013, Master's thesis

【Abstract】 Forensic techniques based on digital signatures and on digital watermarks are collectively called active forensics and were the focus of early research in digital image forensics. Passive forensics emerged later. It requires no information to be embedded in the image beforehand: it looks for features that tampering operations change, quantifies the degree of change with mathematical tools, and from that decides whether the image has been tampered with. Passive image forensics is highly challenging and is the current focus of image forensics research.

After reviewing the history of passive image forensics, its practical significance, and research results in China and abroad, this thesis studies the detection of altered Chinese characters in images. It systematically analyzes text forensics based on camera calibration and the existing forensic models for text images, concentrates on the mathematical features that change when a Chinese character image is tampered with, and uses those features as evidence of a character's authenticity. To widen the applicability of existing techniques, it also compares text images captured by a digital camera with text images produced by editing software, and proposes a forensic method that quantifies the differences mathematically and trains a support vector machine (SVM) classifier to judge the authenticity of a text image.

The research results are as follows.

1. Building on camera calibration, and exploiting the fact that Chinese characters are square in shape and rich in stroke intersections, a Chinese character model is introduced to estimate the projection rule of the characters in an image. Because the character model and the photographed character differ in size by a fixed ratio, the estimated projection rule differs from the true homography matrix by a constant factor. This removes the requirement of traditional camera calibration that some dimensions of the calibration object be known. The thesis then introduces the idea of reconstructing the characters in the image: a number of stroke intersection points are extracted to represent each character, the reconstructed character is compared with the corresponding character model, and the mean difference over the intersection points describes the degree of projection deviation. Deviation values are estimated for many character images of known authenticity, an experimental threshold is determined by fitting the curve of deviation values, and a character whose deviation exceeds the threshold is judged to be tampered.

2. Current text forensics techniques require the text to lie on a planar surface, are limited to detecting the tampering of whole characters, and have difficulty identifying tampering of part of a character. To overcome these limitations, the thesis studies the differences between pictures taken with a digital camera and text pictures obtained from image editing software, and quantifies the degree of difference with mathematical tools such as kurtosis and finite differences. It proposes splitting the characters: the suspected region is segmented from the image and features are extracted at the edge points of the character strokes. Using a large number of text pictures of known authenticity, an SVM classifier is trained to judge the authenticity of text images.
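
The projection-deviation measure in the first result can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the stroke intersection points of the character model and of the photographed character are already matched, estimates the homography (up to scale) with the standard direct linear transform, maps the image points back onto the model plane, and takes the mean point-to-point distance as the deviation. All function names, the point coordinates, and the matrix `H_TRUE` are hypothetical and chosen only for illustration.

```python
import numpy as np

def estimate_homography(model_pts, image_pts):
    """Direct linear transform: H (up to scale) with image ~ H @ model.

    Needs at least 4 matched points, no three of them collinear.
    """
    rows = []
    for (x, y), (u, v) in zip(model_pts, image_pts):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def apply_homography(h, pts):
    """Map 2-D points through a 3x3 homography using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]

def projection_deviation(h, model_pts, image_pts):
    """Mean distance between back-projected image points and the model points."""
    reconstructed = apply_homography(np.linalg.inv(h), image_pts)
    diffs = reconstructed - np.asarray(model_pts, dtype=float)
    return float(np.mean(np.linalg.norm(diffs, axis=1)))

# Hypothetical stroke intersection points of a character model (unit square).
model = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0],
                  [0.0, 1.0], [0.5, 0.5], [0.25, 0.75]])

# Hypothetical camera projection, used only to synthesize test data.
H_TRUE = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, 3.0],
                   [0.001, 0.002, 1.0]])

genuine = apply_homography(H_TRUE, model)
h_est = estimate_homography(model, genuine)

# A "tampered" character: its points no longer follow the same projection.
offsets = 0.05 * np.array([[1, 0], [0, 1], [-1, 0], [0, -1], [1, 1], [-1, 1]])
tampered = genuine + offsets

dev_genuine = projection_deviation(h_est, model, genuine)
dev_tampered = projection_deviation(h_est, model, tampered)
```

In this synthetic setting `dev_genuine` is close to zero while `dev_tampered` is clearly larger. In the thesis the decision threshold comes from fitting deviation values measured on images of known authenticity; the sketch does not attempt to reproduce that step.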
