Research on Microcalcification Cluster Detection Algorithms Based on Machine Learning
Microcalcification Clusters Detection Based on Machine Learning
【Author】 Zhang Xinsheng (张新生);
【Advisor】 Gao Xinbo (高新波);
【Author Information】 Xidian University, Pattern Recognition and Intelligent Systems, 2009, PhD
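【Abstract (translated from Chinese)】 Microcalcification clusters in mammograms are an important sign of early breast cancer, so detecting them as early as possible and judging whether they tend toward malignancy is one of the key techniques for early breast cancer diagnosis. However, only 3% of the information in a mammogram is visible to the human eye; most of it cannot be discerned, and even experienced clinicians find it hard to spot in time the tiny calcifications that indicate early breast cancer, which delays the patient's best window for treatment. To detect the occult signs of early breast cancer effectively and better assist doctors in finding it, this thesis draws on recent theoretical results in machine learning, including multiresolution analysis, subspace learning, and ensemble learning, and carries out an in-depth, systematic study of microcalcification cluster enhancement and lesion detection in women's mammograms, laying a foundation for the computer-aided detection and diagnosis of early breast cancer. The main results are as follows. (1) The breast region is structurally complex, and it is hard to select representative samples from mammograms that fully cover the characteristics of the complex regions. To address this, the thesis proposes a new microcalcification cluster region detection algorithm based on active learning. A directional difference filter bank is first used to enhance microcalcification regions and extract features while suppressing interference from complex regions such as bright blood vessels and ducts; a Bootstrap-based active learning method is then used to select samples and improve classifier performance; finally, calcification cluster regions are detected in the mammogram. Experimental results show that the algorithm effectively reduces the false positive rate while maintaining a high detection rate. (2) To improve the generalization ability and running efficiency of the microcalcification cluster detector, the thesis proposes a new detection framework based on subspace learning and the twin support vector machine (TWSVM). The framework first applies a simple artifact removal filter and a high-pass filter to enhance calcification clusters; a subspace learning algorithm is then embedded in the framework to extract subspace features from the image blocks to be processed; finally, TWSVM detects microcalcification cluster regions in the feature subspace. Experimental results show that the generalization ability and processing speed of the detection algorithm are significantly improved. (3) To make full use of the spatial structure information in images, the thesis extends the vector-based twin support vector machine learning algorithm to the twin support tensor machine (TWSTM), which can handle tensor data, and successfully applies it to microcalcification cluster detection. Experimental results show that the algorithm outperforms TWSVM in detection and handles the small-sample problem well. (4) To combine multiple microcalcification cluster detection algorithms and obtain better detection capability than any single detector, the thesis designs a new ensemble learning method, Bracing, and applies it to microcalcification cluster detection in mammograms. The algorithm embeds active relevance feedback in the training of the base learners to improve their generalization ability, and dynamically updates the base learners' weights according to the feedback results. Experimental results show that Bracing improves the generalization ability of the ensemble classifier and avoids overfitting to some extent. (5) Because subspace learning algorithms are rather sensitive to noise in the training data, the thesis designs a method based on the selective ensemble of hybrid multiple subspaces and applies it to calcification cluster detection in the breast. The method selectively ensembles subspace learning algorithms according to their ability to preserve classification information. Experimental results show that it improves the performance and stability of the detection algorithm and adapts better to noisy environments. In summary, to detect the occult signs of early breast cancer effectively and better assist doctors in finding it, the thesis studies in depth and further develops the application of machine learning methods to microcalcification cluster detection; the proposed methods effectively raise the detection rate of microcalcification clusters and lower the false positive rate, offering new ideas and methods for research on mammogram-based computer-aided diagnosis systems.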
【Abstract】 In digital mammograms, the presence of microcalcification clusters (MCs) is an important sign of early breast cancer. Detecting MCs and judging whether they are malignant is therefore one of the key techniques for the early diagnosis of breast cancer. However, only about 3% of the information in a mammogram is visible to the naked eye. Because most details cannot be perceived by human eyes, even a skilled radiologist finds it very difficult to spot the signs of early breast cancer, i.e., microcalcification clusters, in time, and the best window for treatment may be missed. To detect the early signs of this disease and help doctors diagnose breast cancer at an early stage, we propose several new methods for enhancing and detecting suspicious areas using recent machine learning techniques such as multi-resolution analysis, subspace learning, and ensemble learning. The main contributions of this thesis are summarized as follows. (1) Because of the complex structure of mammographic images, it is very difficult to choose typical training samples that cover all of the complex structural features. To overcome this difficulty, a new approach to MCs detection based on active learning is presented. In the proposed algorithm, the microcalcification region is first enhanced with a directional difference filter bank, which effectively extracts the features of MCs and suppresses blood vessels and mammary ducts. Then an active sample selection method based on Bootstrap is employed to select the training set. Finally, the trained Bayesian classifier is used to detect MCs in mammograms.
The experimental results show that the proposed algorithm reduces the false positive rate while keeping the same sensitivity. (2) To improve the efficiency and generalization ability of the MCs detector, a novel framework for MCs detection in mammograms is developed based on subspace learning and the twin support vector machine (TWSVM). In this framework, MCs are first enhanced using a simple but effective artifact removal filter and a well-designed high-pass filter. Subspace learning algorithms are then embedded into the framework to extract subspace features from each image block. Finally, MCs detection is performed in the feature subspaces with TWSVM. The experimental results show that the generalization ability and processing speed of MCs detection are significantly improved. (3) To make full use of the spatial structure information in images, we generalize the vector-based twin support vector machine (TWSVM) to a tensor-based method, the twin support tensor machine (TWSTM), and successfully apply it to MCs detection. The experimental results show that the proposed algorithm achieves better detection performance than the TWSVM-based one and handles the small-sample problem better. (4) To obtain better performance by combining multiple MCs detection methods rather than relying on a single algorithm, we develop a new ensemble learning method, Bracing, and apply it to MCs detection in mammograms. In this method, active relevance feedback is embedded in the training of new base learners to improve their generalization ability. Meanwhile, the weight of each base learner is dynamically updated according to the weighted score of the feedback result.
Experimental results demonstrate that the Bracing algorithm improves the generalization ability of the ensemble classifier and avoids overfitting to some extent. (5) Because subspace learning algorithms are sensitive to noise in the training data, we propose a hybrid subspace selective ensemble (HSSE) and successfully apply it to MCs detection in mammograms. In this algorithm, subspace learning algorithms are selectively ensembled according to their ability to preserve classification information. Experimental results show that the proposed method improves the performance and stability of MCs detection and adapts better to noisy environments. In summary, to effectively detect the early signs of breast cancer in mammograms and to help doctors diagnose breast cancer at an early stage, we studied machine learning methods and their application to microcalcification cluster detection in depth. The proposed methods achieve satisfactory sensitivity and reduce the false positive rate, providing new ideas and methods for the research and development of computer-aided detection systems for breast cancer.
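As an illustration of the Bootstrap-based active sample selection named in contribution (1), the sketch below shows one common reading of the idea: start from a small random training set, train a classifier, and repeatedly add pool samples that the current classifier gets wrong. The abstract does not give the thesis's actual procedure, so everything here is an assumption: the function names (`fit_gnb`, `predict`, `bootstrap_select`, `make_demo_data`), the parameters (`n_init`, `n_add`, `rounds`), the use of Gaussian naive Bayes as a stand-in for the "Bayesian classifier", and the synthetic 2-D features that replace real filter-bank outputs.

```python
import math
import random


def fit_gnb(X, y):
    """Fit a Gaussian naive Bayes model (assumed stand-in for the Bayesian classifier)."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps variances positive when a class has few samples.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(zip(*rows), means)]
        model[c] = (n / len(y), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances))
    return max(model, key=log_posterior)


def bootstrap_select(X_pool, y_pool, n_init=20, n_add=10, rounds=5, seed=0):
    """Grow a training set by repeatedly adding samples the current model misclassifies."""
    rng = random.Random(seed)
    order = list(range(len(X_pool)))
    rng.shuffle(order)
    selected = order[:n_init]      # small random starting set
    remaining = order[n_init:]     # candidate samples not yet used for training
    for _ in range(rounds):
        model = fit_gnb([X_pool[i] for i in selected],
                        [y_pool[i] for i in selected])
        errors = [i for i in remaining
                  if predict(model, X_pool[i]) != y_pool[i]]
        if not errors:
            break                  # the pool holds no more misclassified samples
        added = errors[:n_add]
        selected = selected + added
        remaining = [i for i in remaining if i not in added]
    # Refit so the returned model reflects the final training set.
    model = fit_gnb([X_pool[i] for i in selected],
                    [y_pool[i] for i in selected])
    return model, selected


def make_demo_data(n_per_class=100, seed=1):
    """Hypothetical 2-D features: label 0 = background, label 1 = calcification."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y
```

Calling `bootstrap_select(*make_demo_data())` returns the final model and the indices of the selected training samples. In a real detector the pool would hold feature vectors extracted from image blocks, and the selection rule might well differ from this error-driven one.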