

An English Text Digital Watermarking Algorithm Based on the Idea of Virus Debris

【Author】 Zhou Haiyan (周海燕)

【Advisor】 Hu Fengsong (胡峰松)

【Author information】 Hunan University, Computer Application Technology, 2007, Master's degree

【Abstract】 Text digital watermarking is receiving growing attention as a means of copyright protection. Digital text, however, is inherently "binary" and lacks the rich texture of other media, which makes both robustness and payload capacity much harder to achieve. As a result, research on text watermarking lags far behind that on other digital media. The feature coding method proposed by N. F. Maxemchuk et al. handles the payload problem well, but no satisfactory solution to the robustness problem has yet emerged. Most existing text watermarking algorithms ultimately depend on format: file formats such as DOC and PDF, or typesetting attributes such as font, font size, text color, character spacing, and line spacing. Once the format of the text is modified, the embedded watermark is very likely lost. Weak robustness is a shortcoming shared by most current text watermarking algorithms, and it seriously constrains the maturity and development of the technology.
This thesis proposes an English text digital watermarking algorithm based on the idea of virus debris. The choice of embedding positions is inspired by the way computer viruses store themselves in separate blocks: the characters of the whole English text are divided into short segments (elements) bounded by certain specific letters, the elements are classified into several sets (aggregates) according to fixed rules, and one piece of watermark information is embedded in each set. The embedding method exploits the fact that Unicode contains characters that look alike but differ internally, that is, characters with identical shapes but different code points. Substituting one such character for another embeds information. During detection, the watermark piece embedded in a set can be extracted as long as the carrier character in at least one element of that set has not been destroyed.
Because the algorithm works entirely on plain TXT text, format attacks have no effect on it, so its robustness is well supported in theory. A complete software system based on the algorithm was developed in a .NET + MS Access environment, and extensive experiments show that the robustness matches the theoretical expectation. Since the algorithm can be implemented on plain TXT text, it can also be implemented in formatted documents, and can therefore be widely applied to English electronic publications and web pages.
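The scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation (which was built in .NET + MS Access), and several details are assumptions made only for illustration: the look-alike table `HOMOGLYPHS` (Latin to Cyrillic), the boundary letter `DELIMITER`, the grouping rule (element index modulo the number of bits), and the convention that bit 1 is a substituted character while bit 0 leaves the element untouched.

```python
# Hypothetical look-alike table: Latin letter -> visually similar Cyrillic letter.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "c": "\u0441", "p": "\u0440"}
REVERSE = {v: k for k, v in HOMOGLYPHS.items()}
DELIMITER = "t"  # hypothetical boundary letter; deliberately not a carrier letter


def split_elements(text):
    """Split text into elements, each ending just after a delimiter letter."""
    elements, current = [], []
    for ch in text:
        current.append(ch)
        if ch == DELIMITER:
            elements.append("".join(current))
            current = []
    if current:
        elements.append("".join(current))
    return elements


def embed(text, bits):
    """Embed bits[k] in every element whose index is congruent to k mod len(bits)."""
    out = []
    for i, element in enumerate(split_elements(text)):
        if bits[i % len(bits)] == 1:
            # Bit 1: swap the first carrier letter for its look-alike.
            for pos, ch in enumerate(element):
                if ch in HOMOGLYPHS:
                    element = element[:pos] + HOMOGLYPHS[ch] + element[pos + 1:]
                    break
        out.append(element)
    return "".join(out)


def extract(text, n_bits):
    """A set decodes to 1 if any of its elements still carries a look-alike."""
    bits = [0] * n_bits
    for i, element in enumerate(split_elements(text)):
        if any(ch in REVERSE for ch in element):
            bits[i % n_bits] = 1
    return bits


cover = "the cat sat on the mat; the pet ate the meat at the gate"
marked = embed(cover, [1, 0, 1])
recovered = extract(marked, 3)
```

Because every element of a set carries the same bit, the bit survives as long as one marked element of that set is intact. Under the assumed encoding, a set whose marked elements are all destroyed decodes as 0, so a practical design would need a distinct representation for bit 0 as well.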

【Key words】 Digital Watermarking; Virus; Divide; Sort
  • 【Online publication contributor】 Hunan University
  • 【Online publication year and issue】 2007, Issue 05
  • 【Classification number】 TP309.7

