
细胞图像的分割、纹理提取及识别方法研究

Research on Segmentation, Texture Extraction and Classification Methods for Cell Images

【Author】 Li Kuan (李宽)

【Supervisor】 Yin Jianping (殷建平)

【Author information】 National University of Defense Technology (国防科学技术大学), Computer Science and Technology, 2012, PhD

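【Abstract, translated from the Chinese】 This thesis studies key techniques in the segmentation, texture extraction and classification of cell images. The main topics are the accurate extraction of nucleus and cytoplasm boundaries in single-cell images, texture extraction from cell images, and multi-feature fusion classification of cell images. It also attempts to improve the Extreme Learning Machine (ELM) classifier so that it can handle the imbalanced-data problem that often arises in cell image classification. The main results are as follows:

1. A GVF Snake active contour model based on radiating gradients is proposed to locate the nucleus and cytoplasm boundaries in single-cell images accurately. The GVF Snake active contour model is a widely used object boundary tracking algorithm. In cell images, the boundary between cytoplasm and background is relatively blurred, interfering blood cells and inflammatory cells are often scattered near the nucleus and cytoplasm boundaries, and the staining concentration is uneven. All of these can easily attract the GVF Snake contour to the wrong position. To address these problems, and drawing on the distribution of staining concentration in cell images, the following improvements are proposed: (1) making full use of gradient direction information, an edge map computation based on radiating gradients is proposed which, compared with the conventional gradient edge map, extracts blurred edges effectively; (2) a stack-based gray-level difference compensation algorithm is proposed which, combined with suppression of positive gray-level differences, effectively overcomes the false gradients caused by noise, blood cells, inflammatory cells and similar interference. Experiments on the Herlev cervical cell dataset verify the effectiveness of the method.

2. A texture extraction method for cell images based on block statistics of Gabor coefficients is proposed. Texture features play an important role in recognising the state of cell images. To address the high feature dimension and large data volume of Gabor-filtered images, the following improvements are proposed. The image is divided into sub-blocks, and the mean and variance of the Gabor coefficients at specific scales and orientations are computed within each sub-block to form that block's feature vector. The block feature vectors are then concatenated in row-column order to form the feature vector of the whole image. Multiple two-class classifiers are combined into a multi-class classifier by a voting rule, and when each two-class classifier is designed, the most discriminative optimal feature subset is selected adaptively. Experiments on the Yale face database compare the method with other methods and examine how block size and the dimension of the optimal feature subset affect the recognition rate. A staining pattern classification experiment on HEp-2 cells verifies the effectiveness of the method.

3. A texture extraction method for cell images based on maximum-entropy local multiple patterns is proposed. The Local Multiple Pattern (LMP) improves on the Local Binary Pattern (LBP) and is an efficient image texture feature extraction method. However, LMP requires several thresholds to be set by hand, and its feature dimension is too high. To solve these two problems, the following improvements are proposed. First, the gray-level difference histogram of each image is computed, and the thresholds are derived automatically from this histogram according to the maximum entropy principle, so as to retain as much uncertainty and classification information as possible. Second, a plane split-and-concatenate encoding scheme replaces the original encoding scheme, keeping the feature dimension within a polynomial range. The method is compared comprehensively with LBP and LMP on the Outex and KTH-TIPS texture datasets, which verifies its effectiveness. A staining pattern classification experiment on HEp-2 cells also gives satisfactory results.

4. A multi-feature fusion classification method for cell images based on posterior probability is proposed. In cell image classification, the various features extracted by different methods need to be fused effectively. First, a posterior probability classifier is trained on each kind of feature, and the probability outputs of these classifiers are combined by weighted summation to build an integrated classifier. The weight of each classifier is determined by its performance on the training set, so that classifiers that perform better on the training set receive larger weights. The integrated classifier is then used as the meta classifier and embedded in the AdaBoost.M1 ensemble framework to improve classification. A staining pattern classification experiment on HEp-2 cells shows that the method fuses multiple image features effectively and improves classification performance significantly.

5. Two methods are proposed for improving the ELM classifier on imbalanced data classification problems. When clinical data are classified, practical constraints often lead to an imbalanced sample distribution, and standard classification methods frequently fail to give satisfactory results on such data. Imbalanced data classification has received wide research attention, but no related work has yet focused on the ELM classifier. The following improvements are proposed. First, cost-sensitive information is introduced into ELM by assigning different misclassification weights to samples of different classes, with a genetic algorithm searching for the optimal set of weights; this yields the cost-sensitive ELM. Second, the cost-sensitive ELM is embedded in a cost-sensitive AdaBoost.M1 ensemble framework, yielding the cost-sensitive ensemble ELM. Experiments on 19 medically related imbalanced datasets verify the effectiveness of both methods.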
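The block-statistics step of the second contribution can be illustrated with a minimal sketch. It assumes the Gabor magnitude responses (one 2-D map per scale and orientation) have already been computed elsewhere. The function names, the non-overlapping block layout and the choice to drop partial blocks at the image border are illustrative assumptions, not details taken from the thesis.

```python
from statistics import fmean, pvariance


def block_feature_vector(responses, top, left, block_h, block_w):
    """Mean and variance of every response map inside one block.

    `responses` is a list of 2-D maps (lists of rows), one per Gabor
    scale/orientation pair; computing them is outside this sketch.
    """
    features = []
    for resp in responses:
        values = [resp[r][c]
                  for r in range(top, top + block_h)
                  for c in range(left, left + block_w)]
        features.extend([fmean(values), pvariance(values)])
    return features


def total_feature_vector(responses, block_h, block_w):
    """Concatenate block feature vectors in row-column order.

    Assumption: blocks do not overlap, and partial blocks at the
    right and bottom borders are dropped.
    """
    height, width = len(responses[0]), len(responses[0][0])
    total = []
    for top in range(0, height - block_h + 1, block_h):
        for left in range(0, width - block_w + 1, block_w):
            total.extend(
                block_feature_vector(responses, top, left, block_h, block_w))
    return total
```

With S response maps and B blocks, the resulting vector has 2 × S × B entries, which is typically far fewer than the number of raw Gabor coefficients when blocks are much larger than one pixel.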

【Abstract】 This thesis focuses on issues related to the segmentation, texture extraction and classification of cell images. These issues mainly include a method to accurately extract both the nucleus and the cytoplasm, two new texture extraction methods, and a fusion method for different kinds of features. In addition, the use of ELM to deal with the imbalanced dataset classification problem, which is common in medical data, is also discussed. The main contributions are as follows:

1. A radiating GVF Snake (RGVF) model is proposed for the accurate extraction of both the nucleus and the cytoplasm from a single-cell image. The GVF Snake model is a widely used contour tracking method in image processing. However, when it is used to extract the nuclei and cytoplasm from cell images, the GVF Snake may easily be attracted to wrong positions because (1) the boundaries between the cytoplasm and the background are often quite obscure; (2) considerable interference exists near the edges of the nuclei and the cytoplasm, including inflammatory cells, blood cells and other noise. To solve these problems, RGVF involves a new edge map computation method and a stack-based refinement, and is thus robust to contamination and can effectively locate the obscure boundaries. The boundaries can also be traced correctly even if there is interference near the cytoplasm and nucleus regions. Experiments performed on the Herlev dataset, which contains 917 images, show the effectiveness of the proposed algorithm.

2. A novel texture extraction method based on Gabor filters is proposed. Texture features play an important role in cell classification. The cell image is first decomposed by convolving it with multi-scale and multi-orientation Gabor filters, and is then separated into several blocks. The Block Feature Vector (BFV) of each block is obtained through statistical techniques. The Total Feature Vector (TFV) of the whole image is then constructed by concatenating the BFVs in row-column order. In the classification stage, a robust multi-class classification method is built from many two-class classifiers using a voting mechanism. Before each two-class classifier, a feature extraction module adaptively selects the most important features. Comparison with published results on the Yale face database verifies the validity of the proposed method. The staining pattern classification results on the HEp-2 cell dataset also show that the proposed method is effective.

3. A novel texture extraction method named Maximum Entropy based Local Multiple Pattern (MELMP) is proposed. Local multiple patterns (LMP) have been shown to be an efficient and robust texture extraction method. However, the thresholds have to be set manually and the feature dimension is quite high in LMP. To solve these problems, a maximum entropy based thresholding scheme, which computes the thresholds by dividing the intensity difference histogram of an image equally, is adopted, and split-concatenate encoding is used to form shorter and more effective feature vectors. Experimental results on four test suites with an SVM classifier show that the proposed method achieves better overall performance than both LBP and LMP in texture classification. The staining pattern classification results on the HEp-2 cell dataset are also very satisfactory.

4. A feature fusion framework based on posterior probability classifiers and the AdaBoost.M1 framework is introduced. How to fuse different features is important for achieving better cell image classification results. In this thesis: (1) within each boosting round, several posterior probability classifiers are trained, one for each descriptor, and then combined into an integrated classifier; (2) AdaBoost.M1 is modified to enhance the performance of the integrated classifiers. Experimental results on the HEp-2 cell dataset show that the proposed method is effective and can significantly improve classification accuracy.

5. Two strategies to deal with imbalanced classification are proposed, namely cost-sensitive ELM (CS-ELM) and ELM based cost-sensitive AdaBoost (ELM-AdaCx). First, cost-sensitive information is introduced into the training process of ELM to form CS-ELM. A genetic algorithm (GA) is used to find the optimal weights. Second, the proposed CS-ELM is used as the meta classifier and embedded into a cost-sensitive AdaBoost.M1 framework to form ELM-AdaCx. Experimental results on 19 datasets from the KEEL repository show that the proposed strategies achieve more balanced results than the basic ELM.
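The thresholding idea in the third contribution, choosing the LMP thresholds by dividing the intensity difference histogram equally, can be sketched as follows. Splitting the differences into equally populated bins maximises the entropy of the quantised levels. The function names, the way the differences are collected and the handling of ties are assumptions made for illustration; the thesis itself may differ in these details.

```python
def max_entropy_thresholds(differences, levels):
    """Return levels-1 thresholds that split the differences into
    (approximately) equally populated bins.

    `differences` is any iterable of centre-to-neighbour intensity
    differences gathered from one image (how they are gathered is
    outside this sketch).
    """
    ordered = sorted(differences)
    if not ordered or levels < 2:
        raise ValueError("need at least one difference and two levels")
    n = len(ordered)
    # The k-th threshold sits at the k/levels quantile of the sample.
    return [ordered[(k * n) // levels] for k in range(1, levels)]


def quantize(difference, thresholds):
    """Map one difference to a level index in 0 .. len(thresholds)."""
    return sum(difference >= t for t in thresholds)


# Nine differences split into three equally populated levels.
diffs = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
thresholds = max_entropy_thresholds(diffs, 3)   # [-1, 2]
codes = [quantize(d, thresholds) for d in diffs]
# codes == [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Because the thresholds come from the image's own difference distribution, no manual tuning is needed, which is the property the abstract highlights over plain LMP.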
