甘肃省气象科学数据共享平台及其应用研究

Study on Gansu Meteorological Scientific Data Sharing Platform and Application

【Author】 Chen Xuejun (陈学君)

【Advisor】 Li Jijun (李吉均)

【Author information】 Lanzhou University, Human Geography, 2009, PhD

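【Abstract (Chinese version)】 Building a scientific data sharing platform makes the most of scientific data resources and has both scientific and practical value for government decision support, the development of meteorological operations, research in geoscience-related fields, and local economic development. A sharing platform built on metadata technology integrates data resources with data sharing. The platform follows a complete construction strategy: data resources are stored in a distributed manner yet accessed in an orderly way, concurrent multi-user access and traffic distribution are supported, and the accessibility of shared data is greatly improved. At the same time, the platform respects the intrinsic relationships within the data, emphasizes the links between data, and provides detailed and reliable data to users of every level and type. Taking the meteorological scientific data of Gansu Province as its object, this thesis builds the Gansu Meteorological Scientific Data Sharing Platform and reports the following results. Following the principles of combining zonality with azonality and combining comprehensiveness with dominant factors, and applying data quality control techniques, a set of Northwest China meteorological scientific data sets was developed for the first time; it covers five categories (surface meteorology, upper-air meteorology, meteorological disasters, meteorological radiation, and historical climate proxy data) with 31 data sets in total. The platform uses metadata and GIS technologies, takes thematic meteorological data sets as its basic data and a commercial distributed database as its base platform, and gives users transparent access to structured and unstructured data; through a metadata system and a metadata database built on a unified metadata standard, it implements metadata publishing and metadata search and navigation across the sub-nodes of the sharing network. Applying metadata technology in a mature operational system is still uncommon in China. The platform adopts distributed data storage, resolving the problem of dispersed storage with centralized access; unified metadata serves as the macro-level description of the data, and theories and methods of distributed systems from computer science are used to establish a coordination mechanism among the sub-nodes. The technical approach and application systems have been extended to the meteorological departments of Hubei, Jiangxi, Xinjiang, Ningxia and other provinces, can be widely applied to geoscience data sharing systems within science and technology infrastructure platforms, and serve as a demonstration for scientific data sharing in the geosciences in Gansu Province. Using the basic data provided by the platform, the thesis carries out several related studies: a climatic regionalization of Gansu Province; a spatial interpolation method (ISTDW), applied to handling outliers in radar data; an in-depth study of similarity matching for multivariate radar time series, with a matching algorithm; and applied studies using artificial intelligence algorithms, namely bionic optimization algorithms (particle swarm optimization, PSO, and the artificial fish swarm algorithm, AFSA) and prediction algorithms (BP neural networks and least squares support vector machines). The results are as follows. A climatic regionalization scheme for Gansu Province is proposed. The designed ETL process brings radar measurements and conventional surface observations to a consistent spatio-temporal granularity; DIRE (dynamic incremental rule engine) is used for outlier detection and ISTDW interpolation for data cleaning; together these three measures ensure the quality of the radar measurement data and supply "clean" data for building a meteorological data warehouse. The proposed multivariate radar time series similarity matching algorithm (SBMDTSM) accounts for the effects of abnormal data, sequence deformation, and season on the distance measure and gives a formal distance model; experiments show that SBMDTSM outperforms the APCA and DWT algorithms in false dismissal rate, false alarm rate, and accuracy, offering a good solution to the difficult problem of similarity search over multivariate radar time series. The artificial fish swarm algorithm, which tracks changes quickly and is unlikely to be trapped in local minima (that is, it escapes local extrema well), is used to find the optimal parameters of the least squares support vector machine automatically. Finally, a novel hybrid of particle swarm and fish swarm optimization, named the AFSA-PSO-parallel-hybrid evolutionary (APPHE) algorithm, is proposed and used to train BP neural networks; the new algorithm is more stable than conventional BP neural network training, which matters particularly for classification problems.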

【Abstract】 The construction of a scientific data sharing platform has strong scientific and practical significance: it maximizes the effectiveness of data resources and serves local government decision-making, the development of meteorological operations, scientific research in earth science-related fields, local economic construction, and so on. The platform has a complete strategy for data storage, which greatly enhances the accessibility of shared data resources by combining distributed storage with orderly access to support concurrent multi-user access and traffic distribution. At the same time, the platform follows the internal relationships of the data, stresses the links between the various types of data, and provides reliable data for users at all levels.

Taking the meteorological data of Gansu Province as an example, the platform achieved the following results. Following the principles of combining zonality with azonality and combining comprehensiveness with dominant factors, a number of meteorological data sets for Northwest China were developed, covering surface meteorology, upper-air meteorology, meteorological disasters, meteorological radiation, and historical climate proxy data. These five categories contain a total of 31 data sets, and data quality control technology was applied to ensure data quality. The platform uses metadata technology and GIS technology to establish a basic platform that is based on thematic meteorological data sets and built on a commercial distributed database, providing users with transparent access to structured and unstructured data. Based on a unified metadata standard, the metadata system established by the platform provides a navigation mechanism across the different sub-nodes and supports the publication of metadata, which is rarely reported in China. The platform uses distributed data storage and resolves the tension between distributed storage and centralized access by proposing a unified metadata description of the entity data and establishing a coordination mechanism among the system's sub-nodes. The technology and methods of platform development have been introduced to the meteorological services of Hubei, Jiangxi, Xinjiang, Ningxia and other provinces. The platform is an important part of the scientific and technological infrastructure platform of Gansu Province and will provide rich content and reliable data for earth-related scientific research, meteorological services for government decision-making, public weather services, national economic construction, and local research and project construction. The results, platform, and systems developed in this work can be widely used in building other resource sharing systems for Gansu Province and can serve as a model.

In this paper, a variety of related studies are carried out using the sharing platform. First, a climatic regionalization of Gansu Province is carried out. Then a spatial interpolation method (ISTDW) is applied to radar data to handle abnormal and missing data. At the same time, the similarity matching problem for multivariate radar time series is studied in depth and a similarity matching algorithm is given. Finally, several artificial intelligence algorithms are studied: bionic optimization algorithms (particle swarm optimization (PSO) and the artificial fish swarm algorithm (AFSA)) and prediction algorithms (BP neural network and least squares support vector machine). The following results are obtained: a climatic regionalization scheme for Gansu Province is provided; a well-designed ETL process that brings radar data and ground observation data to a consistent spatio-temporal granularity is presented and performs well; DIRE (dynamic incremental rule engine) for detecting abnormal data and the ISTDW interpolation method for data cleansing ensure that the data for the meteorological data warehouse are "clean"; the multivariate radar time series similarity matching algorithm (SBMDTSM) takes into account the effects of abnormal data, sequence deformation, and season on the distance measure, gives a formal model of the distance measure, and has been verified to be better than the APCA and DWT algorithms in false dismissal rate, false alarm rate, and accuracy. The artificial fish swarm algorithm (AFSA) is proposed to choose the parameters of the least squares support vector machine (LS-SVM) automatically in time series prediction. A novel hybrid evolutionary algorithm based on AFSA and PSO, referred to as the AFSA-PSO-parallel-hybrid evolutionary (APPHE) algorithm, is used in FNN training. Compared to FNN trained by the LMBP algorithm, FNN trained by the novel hybrid evolutionary algorithm shows satisfactory performance: it converges quickly towards the optimal position, has good convergence accuracy and high stability, and can avoid overfitting to some extent. FNN training by the novel method has been tested on Iris data classification, and the results are much more accurate and stable than those of the Levenberg-Marquardt back-propagation algorithm.
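The abstract names particle swarm optimization as one ingredient of the APPHE hybrid but gives no algorithmic detail. As background only, the following is a minimal sketch of textbook global-best PSO, not the thesis's APPHE algorithm and not its fish-swarm component. The function name `pso_minimize`, its parameters, their default values, and the `sphere` test objective are all illustrative assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with textbook global-best PSO.

    All names and defaults here are hypothetical; they do not come
    from the thesis.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Simple convex test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_val = pso_minimize(sphere, [(-5.0, 5.0)] * 2)
```

In a setting like the one the abstract describes, `objective` would be a model's validation error as a function of its parameters (for example, network weights or LS-SVM hyperparameters); that mapping is an assumption here, since the abstract does not specify it.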

  • 【Online publication contributor】 Lanzhou University
  • 【Online publication issue】 2009, Issue 11