

Researches on Microscopic Image Recognition of Harmful Algae Blooms in Coastal China

【Author】 Guo Chunfeng (郭春锋)

【Advisor】 Ji Guangrong (姬光荣)

【Author Information】 Ocean University of China, Computer Application Technology, 2011, Ph.D.

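【Abstract (translated from the Chinese)】 In recent years, red tides have occurred persistently and at high frequency along China's coast, seriously affecting residents' drinking water safety, aquaculture, and the landscape value of water bodies, and causing huge economic losses. Government departments at all levels and research institutions in China have an increasingly strong need for rapid monitoring and early warning of harmful red tides. Under these circumstances, building a digital and standardized technology platform for harmful red tide diagnosis, with functions such as integrated harmful red tide information retrieval, supply of algal strains and algal toxin reference standards, provision of standard identification and detection techniques, and remote diagnostic services, has become an urgent national need. Although similar ideas exist internationally, no technical support platform this complete and distinctive has yet been built. This thesis analyzes the occurrence of harmful red tides along China's coast and gives a list of common harmful red tide algae. It collects biomorphological information and multi-viewpoint images of the algal species at different growth stages, from different geographic strains, and from different angles, derives biomorphological classification criteria, and gathers molecular biology, pigment, and spectral information on the species to build a comprehensive database of common harmful red tide algae in Chinese seas. It also integrates the standard identification and detection techniques and the processing methods of the various red tide algae analysis methods into a Web-based harmful red tide biological diagnosis technology platform. Taking traditional biomorphological taxonomy as the basis, it analyzes in depth the clear differences in the detail features and shape features of the species and, using image analysis, statistical learning, and pattern recognition, builds an automatic diagnosis and recognition system for microscopic images of red tide algae. The main work and innovations are as follows:

1. Marine biological information and classification of common red tide algae in Chinese seas. Based on red tide occurrences along China's coast in recent years, a list of the 41 algal species covered by this project is given, the biomorphological features of these 41 species are studied, and the idea of biomorphological classification of red tide algae is outlined, laying the foundation for the design of the harmful red tide algae database and for the microscopic image recognition system.

2. Design and implementation of the database of common harmful red tide algae in Chinese seas. Marine biological information on the species at different growth stages and from different geographic distributions is collected, together with multi-viewpoint images from different angles, and the morphological, molecular biological, pigment, and spectral data obtained in this project are gathered to form the "harmful red tide algae comprehensive information database". Detection techniques suited to different spatio-temporal scales and precisions are also gathered to build a database of red tide algae identification and quantitative detection techniques. In line with the application needs of the diagnosis platform, a processing method library and a recognition database for diagnosis and recognition are designed, yielding a complete comprehensive database of harmful red tide algae.

3. Construction of the "harmful red tide biological diagnosis technology platform". Relying on the comprehensive database, the platform is built from four main parts: the red tide algae comprehensive information database, the identification and quantitative detection technique system, the online diagnosis system, and a virtual center for supplying harmful red tide research and monitoring materials. It supports front-end interaction for database input and query, dynamic publication of red tide research materials, interfaces to diagnosis and identification techniques, and user management. "Online red tide diagnosis" integrates several functional modules developed in the project, including human-computer interactive retrieval, microscopic image recognition, chemical classification, and three-dimensional fluorescence spectrum recognition, and can provide remote online services over the Internet. The platform uses the J2EE architecture and combines mainstream Web application frameworks such as Struts, Spring, and Hibernate into a system framework suited to the project's needs. The system design follows the MVC pattern, separating presentation logic, business logic, and database access logic, which gives good independence, portability, and extensibility.

4. Automatic recognition of microscopic images of red tide algae. By analyzing the biomorphological detail features and shape features of harmful red tide algae cells, an automatic classification scheme for microscopic images is established. Three detail features of algal cells (presence or absence of setae, of a cingulum and sulcus, and of spines) are each extracted automatically and serve as key criteria for automatic classification. A three-level cascade of two-class classifiers is then designed, forming a tree-shaped decision scheme that divides the large sample set into small sample sets. Each small sample set is classified in its own way, and global shape features are then extracted to produce the recognition result. The multi-level classifier design also improves recognition accuracy.

Classifier I performs the first-level classification of an uploaded image according to whether the cell has setae. For a microscopic image of red tide algae, the cell is first extracted with an algorithm based on a gray-level direction-angle model. Because cells of seta-bearing algae (Chaetoceros) have many branches, a skeleton is extracted by morphological thinning to obtain the detail features of the cell skeleton, and the numbers of skeleton nodes and endpoints serve as the criterion for deciding whether the cell is a seta-bearing alga. Seta-bearing algae are then classified and recognized to give a diagnosis, and algae without setae are passed to Classifier II.

Classifier II performs the second-level classification of algae without setae according to whether they have a cingulum and sulcus. The cell is first extracted with a maximum-contour algorithm based on automatic thresholding. Because the cingulum and sulcus region lies at a different depth of field from the cell body in the microscopic image, the grooves are extracted with a constrained marker-based watershed transform, giving a detailed description of the cell's cingulum and sulcus. Two ratios are computed: the area of the extracted groove region to the area of the cell, and the distance between the groove-region centroid and the cell centroid to the length of the cell's minimum bounding rectangle. The magnitudes of these two ratios serve as the criterion for whether the species has a cingulum and sulcus. Algae without setae but with a cingulum and sulcus are classified and recognized to give a diagnosis, and algae with neither are passed to Classifier III.

Classifier III performs the third-level classification of algae with neither setae nor a cingulum and sulcus according to whether they have spines. Because spines are small in the microscopic image and attached to protrusions of the cell outline, a spine extraction method based on an optimal structuring element is used to obtain a detailed description of the cell's spines, and the cells are divided into two classes according to whether spines are present. Each class is then classified and recognized to give a diagnosis.

For classification and recognition, the biomorphological features of the different red tide algae are taken into account: on top of the cell extraction, invariant moments and shape factors are extracted and described to form a feature sample set; a support vector machine is trained on the feature sample set to obtain a library of recognition models; and the feature data of a sample to be recognized are matched against the model library of the corresponding class to obtain the final diagnosis. With this classifier scheme, a recognition test was run on 41 red tide algae species and 3,600 microscopic images in total (2,600 training samples and 1,000 test samples). The average recognition rate was 83.27%; after deducting the recognition errors of the three-level classifiers, the actual average recognition rate was 82.05%, a good recognition result.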
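The skeleton criterion of Classifier I (counting skeleton endpoints and nodes) can be illustrated with a small sketch. The abstract does not say how endpoints and nodes are detected or which counts trigger the decision, so the crossing-number test, the function names, and the thresholds below are all assumptions chosen for illustration, and the input is assumed to be an already thinned, one-pixel-wide binary skeleton.

```python
from typing import List, Tuple

Grid = List[List[int]]  # 1 = skeleton pixel, 0 = background

# 8-neighbour offsets in circular (clockwise) order, starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def pixel(grid: Grid, r: int, c: int) -> int:
    """Return the pixel value, treating everything outside the grid as background."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return grid[r][c]
    return 0


def crossing_number(grid: Grid, r: int, c: int) -> int:
    """Half the number of 0/1 changes met when walking once around the 8 neighbours."""
    ring = [pixel(grid, r + dr, c + dc) for dr, dc in RING]
    return sum(abs(ring[i] - ring[(i + 1) % 8]) for i in range(8)) // 2


def count_endpoints_and_nodes(grid: Grid) -> Tuple[int, int]:
    """Count skeleton endpoints (crossing number 1) and nodes (crossing number >= 3)."""
    endpoints = nodes = 0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if not value:
                continue
            cn = crossing_number(grid, r, c)
            if cn == 1:
                endpoints += 1
            elif cn >= 3:
                nodes += 1
    return endpoints, nodes


def looks_seta_bearing(grid: Grid, min_endpoints: int = 4, min_nodes: int = 1) -> bool:
    """Hypothetical decision rule: many branches suggest a seta-bearing cell.

    The thresholds are illustrative assumptions, not values from the thesis.
    """
    endpoints, nodes = count_endpoints_and_nodes(grid)
    return endpoints >= min_endpoints and nodes >= min_nodes


# A branched toy skeleton (a cross) and an unbranched one (a straight line).
cross = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
line = [[1, 1, 1, 1]]
```

On these toy inputs the cross yields four endpoints and one node, while the straight line yields two endpoints and no node, so only the cross passes the hypothetical rule.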

【Abstract】 In recent years, the increasingly frequent occurrence of red tides in coastal China has seriously affected residents' drinking water safety, aquaculture, the landscape value of water bodies, and other areas, causing enormous economic losses every year. Government departments and research institutions have an increasingly strong demand for rapid monitoring of the dynamics of phytoplankton communities in water and for early warning and forecasting of freshwater cyanobacterial blooms and of marine diatom and dinoflagellate red tides. Against this background, establishing a digital and standardized technology platform for harmful red tide diagnosis, with features such as integrated harmful red tide information retrieval, supply of algal species and algal toxin standards, standard identification and detection techniques, and remote diagnostic services, has become a pressing national need. Although some international institutions have proposed similar ideas, no such complete and distinctive technical support platform has yet been built. Based on an analysis of harmful algal bloom (HAB) occurrence in coastal China, a list of the HAB species studied is proposed. Bio-morphological information and multi-viewpoint images of the algal species, collected at different growth stages, from different geographic strains, and from different angles, are used to derive bio-morphological classification criteria. Together with molecular biology, pigment, and spectral information on the species, these data form a comprehensive database of HAB species in China's coastal waters.
This thesis also integrates the standard identification and detection techniques of the various red tide algae analysis methods to construct a Web-based harmful red tide biological diagnostic technology platform. Taking traditional morphological taxonomy as the basis, and analyzing in depth the significant differences in the detail features and shape features of the algal species, it builds an automatic diagnosis and recognition system for microscopic images based on image analysis, statistical learning, and pattern recognition. The main work and innovations are as follows:

1. Research on the marine biology information and classification of HAB species in coastal China. Based on red tide occurrences in China's coastal areas in recent years, the 41 algal species covered by this project are listed, their morphological taxonomic characteristics are studied, and an approach to morphological classification is put forward, laying a foundation for the design of the harmful algae database and for the microscopic image recognition system.

2. Design and implementation of the HAB comprehensive database. The HAB comprehensive information database is formed by collecting marine biology information on the species at different growth stages and from different geographic distributions, together with multi-viewpoint images taken from different angles, and by bringing together the morphological, molecular biology, pigment, and spectral data obtained in the project. A database for the "HAB identification and quantitative detection technology system" is designed and created, gathering detection techniques applicable at different spatial and temporal scales and precisions. In line with the application requirements of the HAB biological diagnostic technology platform, a processing method library and a recognition database for diagnosis are designed, forming a complete comprehensive database of harmful algae.

3. Establishment of the HAB biological diagnostic technology platform. The platform includes the comprehensive information database, the identification and quantitative detection system, the on-line diagnostic system, and the virtual center for the supply of research and monitoring materials. It supports database input and query, dynamic publication of research-related information, technical interfaces for diagnosis and identification, and user management. The on-line diagnostic system integrates a number of modules developed in the project, including human-computer interactive retrieval, microscopic image recognition, chemical classification, and three-dimensional fluorescence spectrum recognition, and can provide remote services over the Internet. The platform adopts the J2EE architecture and integrates mainstream Web application frameworks such as Struts, Spring, and Hibernate to realize a system framework that meets the project's requirements. The system design adopts the MVC pattern to separate presentation logic, business logic, and database access logic, giving good independence, portability, and scalability.

4. Research on automatic identification of HAB microscopic images. By analyzing the morphological detail features and shape features of HAB cells, an automatic classification system for the microscopic images is established. Three detail features of algal cells (presence or absence of setae, of a cingulum or sulcus, and of spines) are each extracted automatically and serve as the key criteria for automatic classification of the microscopic images. Three levels of two-class classifiers are then designed, forming a tree-structured identification system that divides the large sample set into small sample sets. Each small sample set is classified with its own automatic classification method, and global shape features are then extracted to obtain the recognition result.
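The tree-structured arrangement of three two-class classifiers can be sketched as a simple routing function. This is only a minimal illustration under stated assumptions: the `CellFeatures` container, the subset names, and the per-subset `models` callables (stand-ins for the trained recognizers) are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class CellFeatures:
    """Detail features assumed to have been extracted already (hypothetical container)."""
    has_seta: bool
    has_cingulum_or_sulcus: bool
    has_spine: bool


def route_to_subset(features: CellFeatures) -> str:
    """Apply the three binary decisions in order and name the small sample set reached."""
    if features.has_seta:                  # level 1: Classifier I
        return "seta"
    if features.has_cingulum_or_sulcus:    # level 2: Classifier II
        return "no_seta/groove"
    if features.has_spine:                 # level 3: Classifier III
        return "no_seta/no_groove/spine"
    return "no_seta/no_groove/no_spine"


def recognize(
    features: CellFeatures,
    shape_vector: List[float],
    models: Dict[str, Callable[[List[float]], str]],
) -> Tuple[str, str]:
    """Route the cell to its subset, then let that subset's model name the species."""
    subset = route_to_subset(features)
    return subset, models[subset](shape_vector)


# Stub models that only echo the subset; real ones would be trained per subset.
SUBSETS = [
    "seta",
    "no_seta/groove",
    "no_seta/no_groove/spine",
    "no_seta/no_groove/no_spine",
]
stub_models = {name: (lambda vec, n=name: "species_in_" + n) for name in SUBSETS}

example = CellFeatures(has_seta=False, has_cingulum_or_sulcus=True, has_spine=False)
result = recognize(example, [0.1, 0.2], stub_models)
```

Each recognizer only has to separate the species inside its own subset, which is the sense in which the cascade turns one large sample set into several small ones.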
The multi-level classifier design also improves the recognition accuracy.

Classifier Ⅰ performs the first level of classification on an uploaded image according to whether the cell has setae. First, the target cell is extracted from the microscopic image with an algorithm based on a gray-level direction-angle model. Because Chaetoceros cells have many bifurcations, the cell is thinned morphologically to obtain its skeleton, and the detail features of the skeleton serve as the criterion for identifying Chaetoceros. Algae classified as Chaetoceros at this level are then recognized and the result is returned; algae without setae are passed on to Classifier Ⅱ.

Classifier Ⅱ performs the second level of classification on algae without setae according to whether they have a cingulum or sulcus. The maximum contour of the target cell is extracted with an automatic thresholding algorithm, and the cingulum or sulcus is then extracted with a constrained marker-based watershed transform to obtain a detailed description of these grooves.
Two ratios are calculated as the criterion: the ratio of the area of the extracted cingulum or sulcus to the area of the cell, and the ratio of the distance between the groove centroid and the cell centroid to the length of the cell's minimum bounding rectangle. Images with a cingulum or sulcus are then classified and recognized; images without one are passed on to Classifier Ⅲ.

Classifier Ⅲ performs the third level of classification on algae without a cingulum or sulcus according to whether they have spines. An extraction method based on the best structuring element is used to obtain a detailed description of the spines of the algal cells, the cells are divided into two categories, and each category is then classified and recognized.

For classification and recognition, the bio-morphological features of the different algae are taken into account: invariant moments and shape factors are extracted and described on the basis of the extracted target cells to form the feature samples; a support vector machine is trained on these samples to obtain a recognition model database; and the features of a sample to be identified are matched against the corresponding model database to obtain the final recognition result. In a recognition test on 41 red tide algae species with 3,600 microscopic images in total (2,600 training samples and 1,000 test samples), the classifier scheme above achieved an average recognition rate of 83.27%; after deducting the recognition errors of the three-level classifiers, the actual average recognition rate is 82.05%, a good recognition result.
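The two ratios used by Classifier Ⅱ can be computed from binary masks of the cell and of the extracted groove region, as in the sketch below. It rests on assumptions: the masks are taken as given (the watershed step is not reproduced), the helper names are hypothetical, and the longer side of an axis-aligned bounding box stands in for the length of the minimum bounding rectangle, which is a simplification. No decision thresholds are shown because the abstract does not state them.

```python
from math import hypot
from typing import List, Tuple

Mask = List[List[int]]  # 1 = pixel belongs to the region, 0 = background


def area(mask: Mask) -> int:
    """Region area measured as the number of foreground pixels."""
    return sum(sum(row) for row in mask)


def centroid(mask: Mask) -> Tuple[float, float]:
    """Mean (row, column) position of the foreground pixels."""
    n = area(mask)
    if n == 0:
        raise ValueError("empty mask has no centroid")
    row_sum = sum(r for r, row in enumerate(mask) for v in row if v)
    col_sum = sum(c for row in mask for c, v in enumerate(row) if v)
    return row_sum / n, col_sum / n


def bounding_box_length(mask: Mask) -> int:
    """Longer side of the axis-aligned bounding box (simplifying assumption)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("empty mask has no bounding box")
    return max(rows[-1] - rows[0] + 1, max(cols) - min(cols) + 1)


def groove_ratios(cell: Mask, groove: Mask) -> Tuple[float, float]:
    """Return (groove area / cell area, centroid distance / cell length)."""
    area_ratio = area(groove) / area(cell)
    (r1, c1), (r2, c2) = centroid(cell), centroid(groove)
    distance_ratio = hypot(r1 - r2, c1 - c2) / bounding_box_length(cell)
    return area_ratio, distance_ratio


# Toy masks: a 4 x 6 cell with a one-pixel-high groove band across row 1.
cell_mask = [[1] * 6 for _ in range(4)]
groove_mask = [[0] * 6, [1] * 6, [0] * 6, [0] * 6]
area_ratio, distance_ratio = groove_ratios(cell_mask, groove_mask)
```

For the toy masks the groove covers a quarter of the cell and its centroid sits half a pixel from the cell centroid, so the second ratio is 0.5 divided by the cell length of 6.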
