Gastrointestinal Tumor Markers for Diagnosing Carcinoma of the Large Intestine in the Clinical Practice of Laboratory Medicine

【Author】 郑旅芳

【Supervisor】 王开正

【Author information】 Luzhou Medical College (泸州医学院), Internal Medicine, 2010, Master's degree

【Abstract】 Data mining is the non-trivial process of extracting valid, novel, potentially useful, and ultimately understandable patterns from large amounts of data. In the broad view, data mining is the process of "mining" interesting knowledge from large volumes of data stored in databases, data warehouses, or other information repositories. Data mining is also called Knowledge Discovery in Database (KDD), although some regard it as only one essential step of the knowledge discovery process. That process consists of seven steps: (1) data cleaning, (2) data integration, (3) data selection, (4) data transformation, (5) data mining, (6) pattern evaluation, and (7) knowledge representation. Data mining can interact with the user or with a knowledge base.

Data mining draws on ideas from several fields: (1) sampling, estimation, and hypothesis testing from statistics; and (2) search algorithms, modeling techniques, and learning theory from artificial intelligence, pattern recognition, and machine learning. It has also rapidly adopted ideas from other fields, including optimization, evolutionary computing, information theory, signal processing, visualization, and information retrieval. Several further fields play important supporting roles. In particular, database systems are needed to provide efficient storage, indexing, and query processing. Techniques from high-performance (parallel) computing are often important for handling massive data sets. Distributed techniques also help process massive data and become essential when the data cannot be gathered in one place.

Information technology and the life sciences are regarded as the defining disciplines of the 21st century. Human society in this century has been called the "information society", and informatization, networking, and high technology have become basic features of social development. In particular, the rapid development of the Internet and other modern information technologies in the 1990s, together with the completion of the Human Genome Project, has confronted people not merely with a huge information database but with a vast ocean of information. The combination of biotechnology and information technology has catalyzed the birth of a new discipline, laboratory medicine informatics.

Medicine is a science closely bound to experiment and information, and laboratory medicine is no exception. Completing a diagnosis or a course of treatment is a process of acquiring, processing, and using information. Acquiring information more widely, analyzing it more scientifically, and using it more reasonably determine the quality and level of medical care, and computer technology plays a very important role in this. Computer technology has also brought revolutionary change to medical laboratory testing, altering both how it is learned and how it is practiced. With the development of information technology (chiefly the use of gene databases and protein information), the establishment of highly integrated Laboratory Information Systems (LIS) and Hospital Information Systems (HIS), the rapid growth of clinical medical informatics and disease informatics, the adaptation of laboratory education to the new situation, and the joint efforts of laboratory colleagues, medical laboratory testing has quickly developed into laboratory medicine, which not only supplies experimental data to clinicians but also provides important information for clinical diagnostic and treatment decisions.

Objective: To refine limited laboratory information into efficient diagnostic and therapeutic information, and to explore, at the technical level, a new approach to the clinical practice of laboratory medicine.

Methods: Taking the diagnosis of colorectal cancer (carcinoma of the large intestine) with three serum markers, CA72-4, CA199, and CEA, as an example, and relying on the data platform of the LIS and HIS, an artificial neural network (ANN) was used as the data mining tool and SPSS statistical software was used to build receiver operating characteristic (ROC) data sets, so that each gastrointestinal tumor marker report could be interpreted with a post-test probability.

Results: Of the 1206 gastrointestinal tumor marker specimens included in the study, colorectal cancer accounted for 11.5% (the source's English abstract gives 11.25%). ROC data sets were built for screening and for diagnosing colorectal cancer with CA199, CA72-4, and CEA. Serum concentrations of all three markers were significantly higher in the colorectal cancer group than in the healthy control group and the other-disease group (P<0.01). For screening colorectal cancer, the areas under the ROC curve for CA199, CA72-4, CEA, and the predicted value of the ANN diagnostic model were 0.624, 0.692, 0.721, and 0.785, respectively; for diagnosing colorectal cancer they were 0.607, 0.762, 0.687, and 0.795, respectively. Reports annotated with a post-test probability objectively conveyed the reference value of the test results.

Conclusion: The ANN model has higher diagnostic efficiency when multiple test items are analyzed together, and building ROC data sets and issuing reports annotated with a post-test probability is a practical new approach to the clinical practice of laboratory medicine.
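
The two quantities the abstract reports, the area under the ROC curve and the post-test probability attached to a report, can be illustrated with a minimal Python sketch. This is not the thesis's procedure (which used SPSS and an ANN model); the function names `roc_auc` and `post_test_probability` are hypothetical, the sensitivity and specificity in the usage lines are made-up illustrative values, and only the 11.5% prevalence is taken from the abstract.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case has a
    higher marker value than a randomly chosen negative case
    (ties count as one half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def post_test_probability(prevalence: float, sensitivity: float,
                          specificity: float, positive: bool = True) -> float:
    """Bayes' theorem in odds form: post-test odds = pre-test odds * LR."""
    for name, value in (("prevalence", prevalence),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    if positive:
        # Likelihood ratio of a positive result
        lr = sensitivity / (1.0 - specificity)
    else:
        # Likelihood ratio of a negative result
        lr = (1.0 - sensitivity) / specificity
    pre_odds = prevalence / (1.0 - prevalence)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# Toy marker values (hypothetical), 1 = cancer, 0 = no cancer
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])

# Prevalence 11.5% from the abstract; sensitivity 0.60 and
# specificity 0.85 are assumed values for one hypothetical cut-off.
p_pos = post_test_probability(0.115, 0.60, 0.85, positive=True)
p_neg = post_test_probability(0.115, 0.60, 0.85, positive=False)
print(f"AUC={auc:.2f}  P(cancer|+)={p_pos:.3f}  P(cancer|-)={p_neg:.3f}")
```

With a ROC data set of this kind, each cut-off on the curve supplies a sensitivity and specificity pair, so a reported marker value can be turned into a post-test probability rather than a bare positive or negative flag.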

  • 【Online publication contributor】 Luzhou Medical College (泸州医学院)
  • 【Online publication issue】 2011, Issue 04
  • 【Classification number】 R735.34
  • 【Downloads】 190