基于图像内容的成人图像检测

Pornographic Image Detection Based on Image Content

【Author】 Wang Yushi (王宇石)

【Author information】 Harbin Institute of Technology, Computer Application Technology, 2009, PhD

【Abstract】 The Internet has become an important part of our lives. However, among the vast number of images on the web, there are many pornographic ones. The urgent task of detecting and filtering these harmful images is attracting growing attention from researchers throughout the world. Most researchers try to identify pornographic images by analyzing image content. Approaches proposed in recent years are mostly based on simple low-level visual features such as color, texture, and skin regions. These systems generate many false positives on benign images that contain large regions of skin-like color, for example images of people. This dissertation aims to provide a deeper analysis of image content and to build detection systems that depend less on skin detection and therefore produce far fewer false positives.

The dissertation analyzes pornographic images mainly on the basis of local features. Local features are quantized into visual words, through which the context of an image can be expressed and analyzed efficiently. Images are then classified using both visual words and other low-level features. The methods are described in detail as follows.

First, different types of image features are studied, including color, edges, skin region distribution, and local features. For each feature type, discriminative features are selected or proposed to reflect the characteristics of pornographic images. The dissertation proposes representing the skin region distribution with the occurrence statistics of local uniform patterns of skin blocks. For the visual words, local appearance and texture are extracted as local features, and the extraction of local features is simplified to speed up computation. Moreover, the randomness in the construction of visual words is reduced by adjusting and limiting the clusters' radii in the unsupervised quantization of local features. Finally, a group of rotation-invariant features of edge distribution is developed based on short line segments. Experimental results show that these features are more discriminative for pornographic image detection.

Next, a multi-level image representation is constructed based on visual words. The model comprises three levels: word, phrase, and ROI (region of interest) topic. A simple and efficient method is proposed to construct phrases, which encode the co-occurrence patterns of neighboring words. ROI topics reveal the context of words at a larger scale. The experiments show that the higher-level representation reduces the ambiguity of common words, so that pornographic images can be detected more accurately. To describe the global distribution of words, the author also develops distribution features of pornography-related words. Finally, by means of subspace learning, the multi-level representation is projected into a low-dimensional space, which both reduces the dimensionality of the feature vectors and brings the semantic similarity and geometric distance of image pairs into closer agreement. The experiments show that, unlike traditional methods, the visual-word-based method no longer depends fundamentally on skin detection and therefore has a clearly lower false positive rate, especially on images of people.

Based on the above multi-level representation of visual words, the dissertation proposes a novel kernel that fuses the multi-level context of visual words. The author constructs multi-resolution histogram pyramids of words and phrases and incorporates the classes of the ROIs in which they are located. The similarity of an image pair is then evaluated by histogram intersection. Experimental results demonstrate that support vector machines (SVMs) using this kernel perform better not only in pornographic image detection but also in general image classification problems.

Considering the high computational cost of image classification based on this kernel, the author proposes integrating the kernel into local learning. The pattern analysis of pornographic images is then confined to local regions of the feature space; in other words, an image can be classified using only the small part of the training images close to it. First, images are grouped according to low-level features. Second, within each group the algorithm selects a set of representative training points, and a local SVM is constructed in the neighborhood of each. These SVMs are then weighted by their classification performance. Finally, a test image is classified jointly by its k nearest local SVMs. The experiments show that the system has a lower computational cost and accurately recognizes pornographic images distributed across different subspaces.

In this dissertation the author makes use of various kinds of information in pornographic images and gives a comprehensive semantic analysis. Based on visual words, a complete detection strategy is developed. Compared with the baselines, the proposed systems perform better, particularly on images that are difficult for traditional methods to classify.
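
As a rough illustration of two ideas named in the abstract, quantizing local features into visual words and comparing images by histogram intersection, the following minimal Python sketch may help. It is not the dissertation's method: the function names, the toy 2-D descriptors, and the three-word codebook are all invented for illustration, and the multi-resolution pyramid, phrases, and ROI context are left out.

```python
from collections import Counter
from math import dist


def quantize(descriptors, codebook):
    """Map each local descriptor to the index of its nearest codebook centre."""
    return [
        min(range(len(codebook)), key=lambda i: dist(d, codebook[i]))
        for d in descriptors
    ]


def word_histogram(words, vocab_size):
    """Count visual-word occurrences and normalise to a unit-sum histogram."""
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero for an empty image
    return [counts[i] / total for i in range(vocab_size)]


def histogram_intersection(h1, h2):
    """Similarity of two histograms: the sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy 2-D "descriptors" and a 3-word codebook (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
image_a = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.9)]
image_b = [(0.0, 0.1), (0.1, 1.0), (-0.1, 1.1), (0.2, 0.8)]

hist_a = word_histogram(quantize(image_a, codebook), len(codebook))
hist_b = word_histogram(quantize(image_b, codebook), len(codebook))
similarity = histogram_intersection(hist_a, hist_b)
print(hist_a, hist_b, similarity)
```

With unit-sum histograms the intersection lies between 0 and 1, and an image compared with itself scores 1. A matrix of such pairwise values could in principle be supplied to an SVM as a precomputed kernel, which is the general role the abstract assigns to its kernel.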

【Keywords】 pornographic image; image classification; visual word; multi-level representation; kernel; local learning

【Online publication contributor】 Harbin Institute of Technology

【Classification code】 TP391.41
【Citations】 4
【Downloads】 610