基于内容图像检索与敏感图像过滤的若干算法研究

Research on Some Algorithms of Content-based Image Retrieval and Sensitive Image Filtering

【Author】 Sun Yan (孙艳)

【Advisor】 Wang Zhengxuan (王钲旋)

【Author Information】 Jilin University, Computer Application Technology, 2011, Ph.D.

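【Summary】 This thesis studies content-based image retrieval and content-based sensitive image filtering, and investigates the corresponding algorithms in depth on the basis of theoretical analysis. 1. Content-based image retrieval: to improve the efficiency of relevance feedback, the thesis proposes an image retrieval algorithm based on relevance feedback and collaborative filtering. Collaborative filtering is used to analyze the feedback log files and thereby predict the semantic relevance between the images in the database and the query example. Experimental results show that the retrieval precision of the algorithm is clearly better than that of retrieval methods whose feedback relies entirely on visual image features. After the first round of feedback, its precision is already close to what the traditional method reaches after five rounds. The algorithm needs only three rounds of feedback to essentially reach the highest retrieval precision of the system, and its precision during the feedback process is more stable than that of the traditional method. 2. Content-based sensitive image filtering: to improve the accuracy and efficiency of sensitive image filtering, the thesis proposes a sensitive image filtering algorithm based on Gabor functions and multi-level recognition. Building on skin detection with a statistical color model, the algorithm combines the Sobel operator with Gabor filters to extract skin color and texture features, and uses an RBF neural network and a support vector machine to filter and recognize sensitive images at multiple levels. Experimental results show that the algorithm recognizes and filters both images that contain pornographic content and images that do not, with a correct detection rate above 91% and a false detection rate below 14%.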
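
As a rough illustration of the idea in the summary above, that feedback logs can be mined with collaborative filtering to predict which database images are semantically related to the query, the following Python sketch extends a user's feedback set with images that were often marked relevant in the same past sessions. The log format (one list of image ids per session), the function names `build_cooccurrence`, `similarity` and `extend_feedback`, the cosine co-occurrence measure, and the `top_k` parameter are all assumptions made for illustration; the abstract does not specify how the thesis implements this step.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_cooccurrence(sessions):
    """Count, over all logged sessions, how often each image and each
    pair of images was marked relevant (hypothetical log format)."""
    count = defaultdict(int)
    pair = defaultdict(int)
    for relevant in sessions:
        items = sorted(set(relevant))
        for image in items:
            count[image] += 1
        for a, b in combinations(items, 2):  # a < b because items is sorted
            pair[(a, b)] += 1
    return dict(count), dict(pair)


def similarity(a, b, count, pair):
    """Cosine similarity of two images over the session log."""
    if a == b:
        return 1.0
    key = (a, b) if a < b else (b, a)
    together = pair.get(key, 0)
    if together == 0:
        return 0.0
    return together / sqrt(count[a] * count[b])


def extend_feedback(marked, sessions, top_k=2):
    """Return up to top_k images, not yet marked by the user, that the
    log suggests are most related to the marked images."""
    count, pair = build_cooccurrence(sessions)
    marked = set(marked)
    scores = {}
    for candidate in count:
        if candidate in marked:
            continue
        score = sum(
            similarity(candidate, m, count, pair) for m in marked if m in count
        )
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken by image id for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_k]


# Made-up feedback log: each entry lists the images one user marked relevant.
sessions = [["a", "b", "c"], ["a", "b"], ["c", "d"], ["a", "b", "d"]]
extended = extend_feedback(["a"], sessions, top_k=2)
```

In this toy log, image `b` was marked relevant in every session that also contained `a`, so it ranks first among the candidates. In a full system the extended set would then feed the feature weighting and similarity computation of the retrieval loop.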

【Abstract】 The rapid development of the Internet lets people easily transfer and share vast information resources. This brings great convenience to the production and exchange of information and plays a major role in global economic and cultural exchange. With the rapid development of computer multimedia technology and the spread of image acquisition devices, we have entered the digital age. The amount of information stored as digital images on the network has grown sharply and reached a massive scale. Beyond transmitting and sharing information, people also want to quickly retrieve the target images they are interested in from this ocean of image data. Image retrieval techniques emerged to meet this need.

Text-based image retrieval is one of the earliest such techniques. Images are first manually labeled with a name, capture date, capture location, photographer name, and other descriptive text notes; users then query for the desired image through these labels. This approach relies mainly on people's subjective understanding of an image and ignores the information in the image content itself, so labeling is inevitably affected by uncertain and subjective factors. Furthermore, as the number of images grows rapidly, their content becomes more varied and covers more fields. Manual labeling requires a great deal of labor, and the notes still cannot describe the image content completely and accurately. In addition, traditional text-based retrieval systems cannot effectively manage the retrieved images. Effective retrieval and management technologies for large and complex image collections are therefore needed.
Thus, content-based image retrieval has become a hot research topic in recent years.

Another problem brought by the explosive growth in the number and variety of images is the rapid spread of pornographic, violent, and other sensitive images. Because of their strong visual impact, such images have become material that criminals disseminate. The Internet is a cross-regional, cross-border, and open form of communication, so the harm can reach every corner of the world and seriously damage social stability and people's daily lives, especially the physical and mental health of young people. A complete and effective technical system for filtering such images is therefore desirable, and content-based image filtering has become a focus of research.

Focusing on two hot topics, namely “content-based image retrieval” and “content-based sensitive image filtering”, this thesis studies image retrieval and filtering algorithms in depth on the basis of the relevant theoretical analysis, and validates the effectiveness and superiority of the proposed algorithms by experiment. The work can be summarized in the following two aspects.

Content-based image retrieval technology: Relevance feedback is introduced into image retrieval to address the semantic gap. Relevance feedback is a repeated interaction between user and system, and its key issue is how to improve feedback efficiency and reduce the number of interactions. This thesis therefore proposes a retrieval algorithm based on relevance feedback and collaborative filtering. First, the user submits an example image as the query. Second, the system extracts color coherence features from the image and returns search results.
Then, the user rates the retrieval results and submits a list of relevance feedback images to the system. The system extends the feedback sample set submitted by the user by collaborative filtering, and computes the physical image features, the weight of each feature component, and the image similarity. Finally, the retrieved images are sorted by similarity and the results are output. The experimental results show that the proposed method is clearly superior to a retrieval method based only on visual image features, and that extending the feedback sample set with collaborative filtering clearly increases feedback efficiency. The retrieval accuracy after the first round of feedback is close to the accuracy that the traditional method achieves after 5 rounds. The method reaches approximately its highest retrieval precision after only 3 rounds of feedback, and its precision during the feedback process is more stable than that of the traditional method.

Content-based sensitive image filtering technology: To further improve the accuracy and efficiency of image filtering, this thesis proposes a sensitive image filtering algorithm based on Gabor filtering and multi-level identification.
First, a statistical color model is used to detect skin color in the input image. Second, the Sobel operator is applied to detect edges in the suspected skin areas produced by the first step, so that some false skin-color areas can be removed; on this basis, three skin color features are extracted: the proportion of the whole image covered by skin-color areas, the number of connected skin-color areas, and the proportion of the whole image covered by the largest connected skin-color area. Third, each remaining skin area is divided into small blocks of size 8×8, and Gabor filtering is used to extract the Gabor transform coefficients of every block. Fourth, based on these coefficients, an RBF neural network identifies the skin-texture blocks, and their proportion of the total skin area is taken as the skin texture feature of the image. Finally, the skin color features and the skin texture feature are combined into the feature vector of the image, and a Support Vector Machine (SVM) is used to filter and recognize sensitive images. Experimental results show that the algorithm identifies and filters both pornographic and non-pornographic images well, with a positive detection rate above 91% and a false alarm rate below 14%. In other words, it strikes an appropriate compromise between positive detection rate and false alarm rate.

In summary, this thesis focuses on content-based image retrieval and content-based sensitive image filtering algorithms. It achieves some results in both theoretical exploration and algorithm application, which should help advance image retrieval and image filtering technologies.
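
The three skin color features named in the filtering steps above (skin proportion, number of connected skin areas, and proportion of the largest connected skin area) can be sketched in a few lines of Python. This is a minimal sketch, not the thesis implementation: it assumes the skin detection and edge-based cleanup have already produced a binary mask, and the function name `skin_color_features`, the list-of-rows mask format, and the choice of 4-connectivity are all assumptions for illustration.

```python
from collections import deque


def skin_color_features(mask):
    """Compute three features from a binary skin mask.

    mask is a list of rows of 0/1 values (1 = skin pixel); this input
    format is an assumption. Returns a tuple:
    (skin proportion, number of connected skin areas,
     proportion covered by the largest connected skin area).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    total = height * width
    if total == 0:
        return 0.0, 0, 0.0

    seen = [[False] * width for _ in range(height)]
    region_sizes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            region_sizes.append(size)

    skin_pixels = sum(region_sizes)
    largest = max(region_sizes) if region_sizes else 0
    return skin_pixels / total, len(region_sizes), largest / total


# Toy 4x4 mask with two separate skin areas (3 pixels and 2 pixels).
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
features = skin_color_features(mask)
```

For the toy mask, 5 of 16 pixels are skin, there are two connected areas, and the largest covers 3 of 16 pixels. In the pipeline described above, these three values would be joined with the skin texture feature to form the vector passed to the SVM.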

  • 【Online Publication Contributor】 Jilin University
  • 【Online Publication Issue】 2012, Issue 05