互联网信息资源整合研究

On the Information Integration of the WWW

【Author】 Yu Fanghua (俞方桦)

【Advisor】 Chen Jiaxun (陈家训)

【Author information】 Donghua University, Control Theory and Control Engineering, 2001, PhD

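【Summary】 Internet information resource integration (Web Integration, WI) is a broad, highly interdisciplinary emerging research field. It is closely related to disciplines such as databases, artificial intelligence, and information systems, and it has also brought new research topics to those disciplines. Although research on Web information access and data integration has been under way for some time, along different directions and using methods from different disciplines, no systematic methodology or set of conclusions has yet emerged, and some problems remain unsolved. This thesis identifies the data model, knowledge representation and processing, and practicality and automated processing capability as the key issues of a WI system. On this basis, it studies WI comprehensively and obtains the following results: (1) An ontology-guided method and architecture for WI systems. (2) DOMSF, a deductive object data model with semi-structured features, as the unified data model of a WI system. The model meets the requirements of WI systems, offering rich data types and flexibility. DOMSF is also an object model with deductive capability: the introduction of deduction rules gives it greater functionality and expressive power, allowing it to describe relationships between objects beyond inheritance. (3) An extension of the semantics of the ontology methodology that integrates it closely with the DOMSF data model, and, on that basis, an ontology representation language, ORL. The language is highly expressive and supports both syntactic and semantic interoperability. (4) A layered access model for the dynamic Web, proposed from an object-based viewpoint, which treats pages as templates and handles Web data as a network of objects. On this basis, a source description language, TDL, is proposed; it combines descriptions of document structure and text patterns, describes the data patterns in dynamic Web pages well, and adapts better to frequent page changes. (5) An ontology-based information query method and optimization algorithm built on relational databases. (6) An implementation of WISK (Web Integration Service Kits), a software toolkit for WI systems, together with an example of a real application developed with it. Web information resource integration has broad application prospects and has much to offer in e-commerce, intelligent information retrieval, digital libraries, Web data mining, enterprise information portals, and many other application areas. It is no exaggeration to say that information resource integration will be one of the leading technologies in the next generation of the Web and of electronic services.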

【Abstract】 The information integration of the WWW, or Web Integration (WI) for short, is a broad, interdisciplinary, and novel research area closely related to fields such as databases, AI, and information systems. WI has also brought many new problems to those fields. Although WI has been studied for some time along various directions and with various methods, no systematic method or conclusion is yet available, and some problems remain unresolved. This thesis argues that a unified data model, knowledge representation and processing, and practicability and automation are the three key points of any WI system. Based on these points, the thesis studies WI as a whole and obtains the following results: (1) Ontology-guided techniques and an architecture for WI systems. (2) The Deductive Object Model with Semi-structured Features (DOMSF) as the unified data model for WI systems. DOMSF fits the requirements of the WI environment well, with rich data types and high flexibility. DOMSF supports rule deduction, which makes it much more powerful and expressive and lets it represent complex relations between objects, not only inheritance. (3) An extension of the semantics of ontology that incorporates it with the DOMSF data model, on the basis of which an ontology representation language, ORL, is put forward. ORL is highly expressive and supports both syntactic and semantic interoperability. (4) A layered architecture for accessing the dynamic web, based on the object methodology. In this model, pages are treated as templates, and data on the web form a web of objects. A source description language, TDL, is then designed. TDL combines HTML structure patterns with text patterns, which makes it better suited to dynamic web content and to frequently changing data sources. (5) On top of an RDBMS, ontology-based information retrieval and optimized data storage algorithms are presented, and their soundness and completeness are proved. (6) Finally, based on the above studies, WISK, a toolkit for rapidly building WI systems, was developed, and a prototype application, Integrated Product Catalog for E-commerce, was built with it. Both are described in this thesis. Web Integration has promising applications. Possible domains include e-business, intelligent information retrieval, digital libraries, web data mining, and enterprise knowledge portals, all of which would benefit greatly from WI. It is no exaggeration to say that Web Integration will be one of the key techniques in the new generation of the web and the new era of e-services.

  • 【Online publication contributor】 Donghua University
  • 【Online publication year and issue】 2004, Issue 01