计算Web智能粒度粗糙理论及关键技术研究

Research on Granular Rough Theory and Key Issues in Computational Web Intelligence

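【Author】 Chen Bo (陈波)

【Advisor】 Zhou Mingtian (周明天)

【Author information】 University of Electronic Science and Technology of China, Computer Application Technology, 2008, PhD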

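【Abstract, translated from the Chinese】 With the rapid development of the Internet, more and more enterprises deploy their business applications on the Web, fundamentally changing how users use business applications and receive enterprise services. As business service applications on the Web grow more complex and traffic surges, the complexity and uncertainty of massive data become increasingly prominent. Conventional Web technologies struggle to process such uncertain, complex, massive Web data effectively, or to support knowledge discovery and decision-making over it, and this has become a bottleneck restricting the development of e-commerce and other application domains. Computational Web Intelligence, whose main goal is to handle uncertainty on the Web, provides an effective methodological basis for resolving this difficulty. The structural complexity of upper-level information systems requires the lower level to provide a foundational theory that is more directly oriented toward upper-level information representation, and to use this representation theory as a bridge that combines upper-level applications with Web intelligence mechanisms. With the goal of enriching and developing the theoretical foundations of Computational Web Intelligence in both rough computing and granular computing, this work proposes, fairly systematically and at the three levels of motivation, theory and implementation, a Granular Rough Theory that takes classic rough set theory as its cornerstone, is oriented toward the representative semantics of roughness, is suited to representing semi-structured information systems, and is based on pure mereological relations. For upper-level applications of Computational Web Intelligence, improvements to collaborative filtering algorithms based on rater maturity are proposed. The main innovations are as follows:

(1) The representative semantics of roughness is explained as one of the basic features that make the roughness methodology independent of other soft computing methods. It is proposed that, centered on the basic ideas of rough set theory, the underlying representation model be adjusted so that it explicitly encodes semantic context. In constructing the new representation model, the scope of applicability of the roughness methodology is widened by giving the model the ability to describe a broader range of information sources, and tuples built on the "attribute-value" form are chosen as the basic unit of the new model. Mereological relations, which have rich application contexts, are proposed for describing how the information units of the new model are combined. The fundamental motivation for building roughness on pure mereological relations lies not only in the practical utility of the links between mereology, spatial informatics and ontology but, more importantly, in an attempt to demonstrate the potential value of mereology as a theory.

(2) From the constructive viewpoint of granular computing, a representation model called the Granular Representation Calculus is proposed. Its basic primitive is the atomic information granule in the form of a triple, which represents the simplest complete semantic unit of an information source. Atomic information granules form compound information granules through the aggregation operation, and compound granules in turn represent more complex information structures through aggregation and fusion operations. Because the Granular Representation Calculus serves both as a representation model for general information sources and as the underlying representation system for roughness methods, several special compound granules oriented toward roughness construction are defined, together with operations dedicated to them, and the classification and determination of mereological relations between granules are discussed, so as to support the construction of granular roughness. A method for constructing granular roughness on top of the Granular Representation Calculus is also proposed. First, an aspect shift is performed from the conditional granules to the decision attribute. The overlap operation is then used to determine the mereological relation between the shifted results and the decisional granules, which defines three classes of conditional granules with respect to a decisional granule: regular, irregular and irrelevant. From these, the kernel, hull and corpus granules of the decisional granule are formed, corresponding to the lower approximation, the boundary and the upper approximation of roughness, and granular roughness is thereby constructed. The fundamental difference between Rough Mereology and Granular Rough Theory in how they incorporate mereology is explained.

(3) Ways of extending the Granular Representation Calculus to ontological computing environments and multi-agent systems are proposed. Through a study of upper-level ontologies, using the framework of Kant's synthetic a priori and with reference to the designs of various upper-level ontology systems, the concepts of the Granular Representation Calculus that can be used directly in an ontological computing environment are carried over for space, time, and the categories of quantity, quality, modality and relation, and concepts and operations that are lacking are supplemented and extended as appropriate. A way of expressing information systems in a multi-agent environment is proposed. Information cubes are used to visualize several special information granules, and the problem of epistemic collision among agents is analyzed. Collaborative filtering is interpreted as a special type of multi-agent decision system, and it is proposed to improve collaborative filtering algorithms by using the information hidden behind differences in the amount of knowledge held by the agents.

(4) The Rational Authorities Bias hypothesis is proposed: predictions based on the rating data of experienced users may achieve higher accuracy. Based on this hypothesis, two improvements are proposed. 1) The weights of ratings from users with higher rating maturity are increased, which is called Rational Authorities Bias aware weight adjustment, so that more mature users have greater influence on the prediction than immature ones. 2) Given a preset maturity threshold, the data of all users whose rating maturity falls below the threshold are pruned, which is called Rational Authorities Bias aware data reduction, that is, a reduction of the original user-item rating matrix. Experiments on three mainstream collaborative filtering datasets show that the improved algorithms achieve considerable gains in prediction accuracy and performance.

(5) Two kinds of prototype design for Granular Rough Theory are given, one oriented toward data representation and one toward object-oriented programming. The data-representation prototype uses an open-source system built on the entity-attribute-value model from clinical medical information systems for rapid prototyping, while the object-oriented prototype uses Java classes and objects to express information granules. A functional framework for an ontology-driven Web information system is defined, and it is proposed that Granular Rough Theory form the basis of the computational intelligence engine of the Computational Web Intelligence framework.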
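
The correspondence stated in point (2) between kernel, hull and corpus granules and the lower approximation, boundary and upper approximation can be illustrated with classical rough sets. The following is only a minimal sketch under stated assumptions: it uses an ordinary attribute-value decision table and the standard Pawlak definitions, not the mereological granule calculus of the thesis, and the function name `approximations` and the toy table are hypothetical.

```python
from collections import defaultdict


def approximations(rows, condition_attrs, decision_attr, decision_value):
    """Classical (Pawlak) rough-set approximations of one decision class.

    Returns (lower, boundary, upper) as sets of object ids. In the
    vocabulary of the abstract these loosely play the roles of the
    kernel, hull and corpus; that mapping is an illustration only.
    """
    # Group objects that are indiscernible on the condition attributes.
    blocks = defaultdict(set)
    for obj_id, row in rows.items():
        key = tuple(row[a] for a in condition_attrs)
        blocks[key].add(obj_id)

    # Objects that actually carry the decision value of interest.
    target = {o for o, row in rows.items() if row[decision_attr] == decision_value}

    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:   # entirely inside the class ("regular")
            lower |= block
        if block & target:    # overlaps the class at all
            upper |= block
    # Blocks that never overlap the class ("irrelevant") are left out.
    return lower, upper - lower, upper


# Hypothetical toy decision table: object id -> attribute values.
rows = {
    1: {"fever": "yes", "cough": "yes", "flu": "yes"},
    2: {"fever": "yes", "cough": "yes", "flu": "yes"},
    3: {"fever": "yes", "cough": "no", "flu": "yes"},
    4: {"fever": "yes", "cough": "no", "flu": "no"},
    5: {"fever": "no", "cough": "no", "flu": "no"},
}
lower, boundary, upper = approximations(rows, ["fever", "cough"], "flu", "yes")
```

Objects 3 and 4 share the same condition values but differ in the decision, so their block overlaps the class without being contained in it and lands in the boundary.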

【Abstract】 With the rapid development of the Internet, a great number of business applications have been deployed on the Web, fundamentally changing how customers use those applications and receive services. However, the resulting volume of highly uncertain, complex data has become a major problem: conventional Web technologies struggle to process it effectively or to support knowledge discovery and decision-making. This has become a bottleneck for the further development of e-business and other application domains. Computational Web Intelligence (CWI), which aims to deal with uncertainty on the Web, provides a promising methodology for addressing these problems.

The complexity of upper-level information structures requires the underlying infrastructure to supply a more direct information representation theory, one that serves as a bridge between upper-level applications and Web intelligence mechanisms. To enrich the theoretical foundation of the CWI framework from both the Granular Computing and Rough Computing perspectives, Granular Rough Theory is proposed. It builds on classic Rough Set Theory, captures the representative semantics of roughness, accommodates semi-structured information sources, and is based on pure mereological relations. At the application level, modifications based on rater maturity are proposed to improve Collaborative Filtering algorithms.

This work yields the following major contributions:

(1) The representative semantics of roughness is clarified as an essential feature that makes roughness methodology independent of other soft computing approaches. It is proposed to adjust the underlying representation model of classic Rough Set Theory so that it explicitly encodes the semantic contexts underlying the schema of the original information tables. The design of the new representation model also aims to widen the natural scope of roughness methodology by accommodating semi-structured data, using tuples of the "attribute-value" form as the basic unit. Mereological relations are used to describe the structural relations in the new representation model, because of their rich application semantic contexts.
The motivation for building a purely mereological approach to roughness is not only to exploit the close relationship among mereology, spatial informatics and ontology, but also to demonstrate the potential of mereology for building interdisciplinary methodologies.

(2) A new representation model called Granular Representation Calculus (GrRC) is presented. Its primitive notion is the atomic granule, a triple that encapsulates the minimal complete semantic unit of an information system. Compound granules are aggregated from atomic ones and in turn compose more complex structures through aggregation and fusion operations. Since GrRC serves as a common representation model for both ordinary information sources and roughness methodology, some special kinds of compound granules with dedicated operations are defined, and the classification and identification of mereological relations between granules are discussed, in order to support roughness formation. A roughness formation approach based on GrRC is then proposed. An aspect shift is first performed on the conditional granules toward the decisional attribute, and the results are then overlapped with the decisional granules to identify their part-whole relations. On this basis, conditional granules are classified as regular, irregular or irrelevant with respect to a given decisional granule. All regular granules aggregate into the kernel granule, which stands for the lower approximation in roughness. The irregular ones form the hull granule, which stands for the boundary. Aggregating the kernel and hull granules yields the corpus granule, which stands for the upper approximation. The distinction between Granular Rough Theory and Rough Mereology is clarified in terms of their different viewpoints on incorporating mereology.

(3) Tentative solutions are presented for adapting Granular Rough Theory to ontological computing and multi-agent contexts.
Drawing on design considerations in the major upper-level ontologies, the concepts in GrRC that are applicable to ontological computing are explicated or supplemented in terms of space, time, and the categories of quantity, quality, modality and relation, following Kant’s framework of the synthetic a priori. For multi-agent systems, a representation of their particular information systems is defined. The notion of an information cube is used to visualize special compound granules and to analyze how roughness is formed in multi-agent systems. Epistemic collision among agents is explored, and two auxiliary measurements are defined to alleviate the problem. Collaborative Filtering is discussed as a special kind of multi-agent decisional information system, and the information hidden behind differences in the amount of knowledge held by agents is suggested as a crucial source for improving the quality of Collaborative Filtering algorithms.

(4) The Hypothesis of Rational Authorities Bias (H-RAB) is proposed to capture the expectation that higher prediction accuracy can be attained by emphasizing more mature referential users. Two modifications based on H-RAB are designed. RAB-WS (Weight Scaling) is a fine-tuning method that scales the original similarity weights by the rater’s maturity measure, so as to increase the influence of more mature referential users. RAB-DR (Data Reduction) is a bolder method that prunes all referential users whose maturity measure is below a given Maturity Threshold. Experimental results on three major publicly available Collaborative Filtering datasets empirically support both RAB-aware modifications and the validity of H-RAB.

(5) Two prototype implementations of Granular Rough Theory are presented, one oriented toward data representation and one toward object-oriented programming. The data-representation prototype uses an open-source clinical information system based on the Entity-Attribute-Value model.
The object-oriented prototype uses Java classes and objects to embody the notion of information granules. Moreover, the functional aspects of an ontology-driven Web information system framework are illustrated, with Granular Rough Theory serving as a rough, granular Web intelligence infrastructure for upper-level applications.
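
The two RAB-aware modifications in point (4) can be sketched on top of a standard mean-centered, user-based collaborative filtering prediction. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the abstract does not define the maturity measure, so `maturity` is a hypothetical score in [0, 1], and the function name `predict`, its parameters and the sample neighbors are all invented for the example.

```python
def predict(target_mean, neighbors, maturity_threshold=None, scale_weights=False):
    """Mean-centered user-based CF prediction for one (user, item) pair.

    neighbors: list of dicts with keys
      'sim'      similarity between the neighbor and the target user
      'rating'   the neighbor's rating of the item
      'mean'     the neighbor's mean rating
      'maturity' hypothetical maturity score in [0, 1] (an assumption)
    """
    num = den = 0.0
    for n in neighbors:
        # RAB-DR-style step: prune referential users below the threshold.
        if maturity_threshold is not None and n["maturity"] < maturity_threshold:
            continue
        # RAB-WS-style step: scale the similarity weight by maturity.
        w = n["sim"] * n["maturity"] if scale_weights else n["sim"]
        num += w * (n["rating"] - n["mean"])
        den += abs(w)
    # Fall back to the target user's mean when no neighbor contributes.
    return target_mean if den == 0 else target_mean + num / den


# Two equally similar neighbors who disagree; the first is more "mature".
neighbors = [
    {"sim": 0.8, "rating": 5, "mean": 4.0, "maturity": 1.0},
    {"sim": 0.8, "rating": 2, "mean": 3.0, "maturity": 0.25},
]
baseline = predict(3.0, neighbors)
weight_scaled = predict(3.0, neighbors, scale_weights=True)
data_reduced = predict(3.0, neighbors, maturity_threshold=0.5)
```

In this toy case the unmodified prediction stays at the target user's mean because the two neighbors cancel out, while both variants shift the prediction toward the more mature neighbor's opinion, with pruning shifting it furthest.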
