
Research on Key Technologies of Content-Based Erotic Image Filtering and Its Application

Original Chinese title: 基于内容敏感图像过滤关键技术研究及应用

【Author】 Yang Jinfeng (杨金峰)

【Advisor】 Shen Xuanjing (申铉京)

【Author information】 Jilin University, Computer Application Technology, 2006, Master's thesis

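【Abstract (translated from the Chinese)】 Current efforts to protect users from online pornography rely mainly on URL blocking and sensitive-keyword matching. Both techniques lag behind new content and have clear limitations, so they must be combined with image filtering to stop the spread of pornography more effectively. Against this background, and supported by the 2004 Zhuhai Science and Technology Project (PC20041101), "Research on Content-Based Sensitive Image Filtering Technology and Its Implementation in the IE Browser", this thesis studies several key technologies of content-based sensitive image filtering and builds an effective sensitive image filter on that basis. Finally, URL blocking, keyword matching and sensitive image filtering are combined and applied in the IE browser to detect sensitive information online. The thesis first builds a fairly complete experimental image library, comprising a skin-annotation mask library and a test library, on which all subsequent research is based. The skin-color detection model is the core of the architecture: three skin-color detection models are compared on the annotated mask library, and the best of them is trained and used to mark skin regions in images. The mask image produced by the initial skin detection then undergoes necessary auxiliary processing, including texture verification, filtering and denoising. To reduce the false-detection rate on portrait-type images such as photo-sticker (大头贴) head shots, the system adds a face detection stage that uses a fast AdaBoost-based face detection algorithm. To separate the two classes of images effectively, the thesis draws on the characteristics of sensitive images, extracts and tests classification features from both the mask image and the original image, and uses them as the classifier's input feature vector. Finally, an effective decision tree is built by training with the C4.5 algorithm, and the resulting classification rules are used to classify feature vectors. For practical use, the filter is also integrated into the IE browser as a plug-in that detects and filters sensitive web pages in real time.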
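The decision-tree step mentioned above relies on C4.5, which ranks candidate splits by gain ratio (information gain divided by split information). The following is a minimal sketch of that criterion for a single numeric feature, not the thesis's implementation: it omits pruning and missing-value handling, and it tests midpoints between consecutive values as thresholds, which is a simplification. The function names and the idea of a per-image numeric feature such as a skin-pixel ratio are assumptions made for illustration; the abstract does not name the five selected features.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric feature."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weights = (len(left) / total, len(right) / total)
    info_gain = entropy(labels) - weights[0] * entropy(left) - weights[1] * entropy(right)
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Return the midpoint threshold with the highest gain ratio (None if no split exists)."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t), default=None)
```

A full tree would apply `best_threshold` to every feature at each node, split on the feature with the highest gain ratio, and recurse until the nodes are pure or too small.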

【Abstract】 The flood of online pornography not only seriously harms the physical and mental health of teenagers but also inconveniences ordinary Internet users. Keeping pornography off the Internet is an important research topic that has attracted many researchers in China and abroad. In China, research on anti-pornography software began at the end of the last century; to date, about twenty anti-pornography software products have been developed, all of which rely on blocking Web addresses and matching erotic keywords. With the rapid growth of information on the Internet, these techniques show obvious lag and shortcomings, so image filtering must be added to block the spread of pornography effectively. Supported by the 2004 Zhuhai Science and Technology Planning Project "Research on Content-Based Erotic Image Filtering Technique and Its Application in IE", we study the key technologies of content-based erotic image filtering and design and implement an effective erotic image filter. We then combine the image filter with Web address blocking and keyword matching to form a layered filter for erotic page content, and embed it in the IE browser as a plug-in that filters erotic information online.
This thesis discusses several key technologies of content-based erotic image filtering. After studying previously published results, we design and implement an effective erotic image filter. The main work of the dissertation is as follows:
(1) We construct a fairly complete image database, containing an annotated skin-mask bank of 1442 images and a test image bank of 15890 images, and label the images according to the classification strategy.
All the work in this thesis is based on this image bank.
(2) We analyze the skin-color detection models in common use today. Broadly, there are two lines of research: one based on single pixels, the other on both single pixels and neighborhood information between pixels. We compare three skin-pixel detection algorithms: the Chroma Space Algorithm, the Bayes Classifier Algorithm based on a skin-color statistical histogram, and the Seed Diffusion Algorithm based on neighborhood information. Weighing precision against speed, we select the Bayes Classifier Algorithm based on the skin-color statistical histogram. We determine the optimal threshold by estimating the Equal Error Rate, and choose θ = 0.1175 on our training set. With this decision threshold, skin-color detection achieves 91.52% correctness on a test set of 481 images with an error rate of 8.62%, and detection on a 574×691-pixel image takes 0.0738 seconds.
(3) To reduce the classification error rate on portrait images, we add a face detection mechanism to the filter. Weighing precision against speed, we use the face detection method proposed by P. Viola, which combines AdaBoost with a cascade structure. We choose suitable parameters experimentally; with the trained parameters, face detection achieves 85.56% correctness on a test set of 817 images, and detection on a 740×784-pixel image takes 0.6353 seconds. The results show that adding face detection to our erotic image classifier improves its precision substantially (by about 10% on our test set).
(4) Feature vector extraction and evaluation for erotic image classification, and construction of the Decision Tree classifier.
We first extract ten candidate features from the mask image and the corresponding original image, evaluate the discriminative power of each, and select five as our feature set. We then build a Decision Tree classifier by training the C4.5 algorithm on 3624 labeled examples, and use the resulting rules to classify feature vectors.
Experiments and analysis show that our erotic image classifier distinguishes benign and erotic images effectively: its precision is about 91.13% on our test set of 5053 images (76.13% for erotic images and 92.68% for benign images). With the project's application in mind, we add the final classifier to the IE browser as a plug-in, using BHO (Browser Helper Object) technology, to filter erotic information online.
Many parts of our filtering system still need improvement, such as a more efficient skin-color pixel detection model, detection of specific parts of the human body, and optimization of the system's real-time performance. These are left as future work.
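The histogram-based Bayes skin classifier and the Equal Error Rate threshold selection described in item (2) can be illustrated with a small sketch. It is written under stated assumptions rather than taken from the thesis: the abstract does not say which color space or bin count was used, nor whether θ is applied to a posterior probability or to a likelihood ratio. The sketch assumes RGB pixels, 32 bins per channel, and a threshold on the posterior P(skin | color); all function names are hypothetical.

```python
from collections import Counter

BINS = 32  # assumed quantization: 32 bins per RGB channel


def quantize(pixel, bins=BINS):
    """Map an (r, g, b) pixel with 0..255 channels to a histogram bin."""
    step = 256 // bins
    r, g, b = pixel
    return (r // step, g // step, b // step)


def train_histograms(pixels, labels, bins=BINS):
    """Count skin and non-skin pixels per color bin from a labeled mask set."""
    skin, non_skin = Counter(), Counter()
    for pixel, is_skin in zip(pixels, labels):
        (skin if is_skin else non_skin)[quantize(pixel, bins)] += 1
    return skin, non_skin


def skin_posterior(pixel, skin, non_skin, bins=BINS):
    """P(skin | color) by Bayes' rule; with count-based priors it is s / (s + n)."""
    key = quantize(pixel, bins)
    s, n = skin[key], non_skin[key]
    return s / (s + n) if s + n else 0.0  # unseen colors are treated as non-skin


def equal_error_threshold(scores, labels, candidates):
    """Pick the candidate threshold where false-reject and false-accept rates are closest.

    Requires at least one skin and one non-skin sample in `labels`.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    best_gap, best_theta = None, None
    for theta in candidates:
        false_reject = sum(1 for s, y in zip(scores, labels) if y and s < theta)
        false_accept = sum(1 for s, y in zip(scores, labels) if not y and s >= theta)
        gap = abs(false_reject / positives - false_accept / negatives)
        if best_gap is None or gap < best_gap:
            best_gap, best_theta = gap, theta
    return best_theta
```

In use, a pixel would be marked as skin when `skin_posterior(...)` is at least the selected threshold; the thesis reports θ = 0.1175 for its own model and training set, a value that does not necessarily transfer to this sketch's assumptions.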

  • 【Online publication contributor】 Jilin University
  • 【Online publication issue】 2006, No. 10
  • 【Classification number】 TP391.41
  • 【Citation count】 7
  • 【Download count】 309