Research and Implementation of a Component Retrieval Method Based on Facet Description

【Author】 渠成建 (Qu Chengjian)

【Advisor】 陈立潮 (Chen Lichao)

【Author information】 Taiyuan University of Science and Technology, Computer Application Technology, 2012, Master's degree

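【Abstract, translated from the Chinese】 Software component technology is the core technology for achieving software reuse. Its basic idea is to create reusable software components and to use them to develop new application software. Component-based software development can effectively reduce development cost and improve development efficiency and software quality. As research on and practice of component-based software development have deepened, the number of components has kept growing and component libraries have kept expanding, so providing component reusers with an effective retrieval method has become a core problem that software reuse urgently needs to solve. Facet-based component retrieval has been widely studied and applied in the software reuse community. Existing methods draw on XML, tree matching, ontologies, and other related technologies, but shortcomings remain: for example, query statements are not parsed and component matching is computed imprecisely, and these problems require further exploration. Building on the existing faceted classification and description of components, this thesis gives a new component description model. To address the long retrieval times caused by the large number of components in a library, and the inability of existing component encodings to meet requirements, it gives a new term encoding strategy, builds a component term index from this encoding, and preprocesses the components in the library to improve retrieval efficiency. By analyzing the particular characteristics of natural language parsing in component retrieval and the existing Chinese word segmentation methods, it gives a forward character-by-character maximum matching method that parses a query statement into component terms the library can recognize, and then uses the component term index to find the components containing those terms. Given the characteristics of facet-described components, existing methods that directly apply the three tree matching models to compute matches between components are not very accurate. The thesis therefore gives a new tree matching model, tree inclusion matching, improves the calculation of matching cost on that basis, and introduces the concept of matching degree to describe how well components match, so that matching between components can be analyzed from multiple angles. Finally, experiments analyze recall and precision at different matching degrees and select a better matching degree threshold. Recall and precision at this threshold are compared with those of existing component retrieval methods that use space encoding and multiple tree matching models, and the retrieval time of the proposed method is analyzed, to verify its effectiveness. The experimental results show that the proposed method effectively improves precision and retrieval efficiency while maintaining a high recall.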

【Abstract】 Component technology is the critical technique for achieving software reuse, and the basic idea of Component-Based Software Development (CBSD) is to create reusable software components and use them to develop new application software. Applying CBSD can effectively cut the cost of software development while improving software quality and development efficiency. With in-depth study and practice of CBSD, the number of components keeps increasing, which expands the size of the component library. Providing an effective component retrieval method for users who reuse components has therefore become a critical problem that needs to be resolved quickly. Facet-based component retrieval has been extensively researched and applied in software reuse, and existing methods apply XML, tree matching, ontology, and other techniques. It still needs further exploration, however, because issues remain, such as queries not being parsed and component matching being calculated inaccurately. In this thesis, a new component description model is given on the basis of faceted classification and description. To solve the problems of slow retrieval caused by the expansion of the component library and of existing encodings that cannot meet requirements, a new term encoding strategy is designed. A term index is created based on this encoding, which preprocesses the components and improves retrieval efficiency. By analyzing the particularity of natural language parsing in component retrieval, as well as existing Chinese word segmentation methods, a forward character-by-character maximum matching method is designed. It parses the query into component terms and finds the components that contain these terms through the term index. Considering the features of facet-based components, the existing methods that directly use the three tree matching models to calculate component matching are not very accurate. A new tree matching model, tree inclusion matching, is designed in this thesis, and the calculation of matching cost is improved based on this new model. The concept of matching degree is proposed to describe the degree of match between components and to analyze component matching from multiple angles. Finally, recall and precision are analyzed at different matching degrees, and the better degree is selected as the threshold. The recall and precision at the selected threshold are compared with those of existing component retrieval methods that use space encoding and multiple tree matching models, and the retrieval time is analyzed, validating the effectiveness of the method. The experiments show that the proposed method improves the precision and efficiency of component retrieval while keeping a relatively high recall.
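
The query-parsing step described in the abstract can be illustrated with a minimal sketch. It uses the classic forward maximum matching segmentation over a term vocabulary, followed by a lookup in an inverted term index. The abstract does not specify the thesis's own variant, its term encoding, or its data structures, so the function names (`forward_max_match`, `build_term_index`, `lookup`), the sample components, and the choice to return the union of matching components as candidates are all assumptions made for illustration.

```python
from collections import defaultdict


def forward_max_match(query, vocabulary):
    """Greedy forward longest-match segmentation (classic textbook form).

    Assumption: the vocabulary is a plain set of term strings; the thesis's
    term encoding is not reproduced here.
    """
    max_len = max((len(term) for term in vocabulary), default=1)
    terms = []
    i = 0
    while i < len(query):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(query) - i), 0, -1):
            candidate = query[i:i + size]
            if candidate in vocabulary:
                terms.append(candidate)
                i += size
                break
        else:
            # No known term starts at this position; skip one character.
            i += 1
    return terms


def build_term_index(components):
    """Build an inverted index: term -> set of component names."""
    index = defaultdict(set)
    for name, terms in components.items():
        for term in terms:
            index[term].add(name)
    return index


def lookup(query, index):
    """Parse the query into known terms and collect candidate components."""
    terms = forward_max_match(query, set(index))
    hits = set()
    for term in terms:
        # Assumption: any component containing a query term is a candidate.
        hits |= index.get(term, set())
    return terms, hits


# Hypothetical component library: 文件 = file, 压缩 = compress,
# 图像 = image, 传输 = transfer.
components = {
    "FileCompressor": {"文件", "压缩"},
    "ImageCompressor": {"图像", "压缩"},
    "FileTransfer": {"文件", "传输"},
}
index = build_term_index(components)
terms, hits = lookup("压缩文件的构件", index)  # "a component that compresses files"
print(terms, sorted(hits))
```

In this sketch the candidates are only a coarse filter. In the method the abstract describes, the components found through the term index would then be compared using tree inclusion matching and the matching degree threshold, neither of which is modeled here.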
