引文网络的可调优先粘贴模型及其应用

Adjustable Preferential Attachment Model on Citation Network and Its Application

【Author】 Li Yue (李粤)

【Advisor】 Li Xing (李星)

【Author information】 Tsinghua University, Information and Communication Engineering, 2007, PhD

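【Abstract (translated from the Chinese)】 A citation network is the network formed by the citation relations among papers. Analyzing it is one of the key means of tracing the history of science and of evaluating and predicting the significance, scale, and trends of scientific development. Measuring structural properties and building evolution models are two major topics in citation network research. Existing citation network evolution models cannot comprehensively explain five structural properties: the preferential attachment phenomenon, the node aging phenomenon, the scale-free property, the sleeping beauty phenomenon, and high clustering. The goal of this thesis is to build a citation network evolution model that explains all five properties, and to use the evolution rules revealed by the model to predict how citation networks develop. The main results are:
1. An Adjustable Preferential Attachment model (APA model) is designed to describe citation networks. The thesis first models the two main mechanisms by which citation networks form: the node aging mechanism and the edge copying mechanism. Analytical calculation and numerical simulation are then used to study how the parameters of these two mechanisms affect network structure, and to establish how the two mechanisms relate to the five structural properties. The analysis shows that the APA model describes citation networks well and explains each of the five properties.
2. A parameter estimation method for the APA model is constructed to further verify that the model describes real citation networks reasonably. The method is first used to obtain model parameters from real citation networks. These parameters are then used to generate simulated networks, which are compared with the real networks on the five structural properties. The comparison further shows that the APA model describes real citation networks reasonably and explains all five of their structural properties. The reasons why different real citation networks have different model parameters are also analyzed. Because the APA model describes real citation networks reasonably, it can reveal the rules by which they evolve.
3. A research hot-spot prediction algorithm based on the APA model is proposed. Starting from the growth law of citation counts revealed by the APA model, the thesis proposes a prediction algorithm based on a paper's most recent citation count. Experiments show that its hot-spot prediction accuracy is higher than that of other prediction algorithms. Rank aggregation is then used to further confirm that predicting hot spots from the most recent citation count alone is reasonable. Finally, ranking by most recent citation count is added to a paper search engine and combined with query expansion, which helps users grasp the specific research content and hot spots of a chosen research field more quickly.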
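The two formation mechanisms named in the abstract, node aging and edge copying, can be illustrated with a toy growth process. The abstract does not give the APA model's functional forms, so everything specific below is an assumption made for illustration: the attachment weight `(in_degree + 1) * age ** (-aging_exponent)`, the copying rule, and the parameter names `refs_per_paper`, `aging_exponent`, and `copy_prob`. This is a minimal sketch, not the thesis's model.

```python
import random

def grow_citation_network(n_papers, refs_per_paper=3, aging_exponent=1.0,
                          copy_prob=0.5, seed=0):
    """Grow a toy citation network; returns {paper_id: set of cited paper_ids}.

    Hypothetical rules (not taken from the thesis):
    - aging: attachment weight is (in_degree + 1) * age ** (-aging_exponent)
    - copying: after citing a paper, also cite one of its references
      with probability copy_prob
    """
    rng = random.Random(seed)
    cites = {0: set()}      # paper 0 has nothing to cite
    in_degree = {0: 0}
    for new in range(1, n_papers):
        existing = list(range(new))
        # Age is measured in papers published since p appeared (always >= 1).
        weights = [(in_degree[p] + 1) * (new - p) ** (-aging_exponent)
                   for p in existing]
        n_refs = min(refs_per_paper, new)
        targets = set()
        while len(targets) < n_refs:
            chosen = rng.choices(existing, weights=weights, k=1)[0]
            targets.add(chosen)
            # Edge copying: follow one reference of the paper just cited.
            if (cites[chosen] and len(targets) < n_refs
                    and rng.random() < copy_prob):
                targets.add(rng.choice(sorted(cites[chosen])))
        cites[new] = targets
        in_degree[new] = 0
        for t in targets:
            in_degree[t] += 1
    return cites
```

Setting `aging_exponent=0` and `copy_prob=0` in this sketch reduces it to plain degree-proportional attachment, which gives a baseline for seeing what each assumed mechanism changes.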

【Abstract】 A citation network is the network formed by the citation relations between papers. Analyzing citation networks is one of the key methods for reviewing the history of science and for evaluating and predicting the value, scale, and trends of scientific development. Two important goals of this analysis are characterizing structural properties and modeling network evolution. Existing models fail to simultaneously explain the following structural properties of citation networks: the preferential attachment phenomenon, the node aging phenomenon, the scale-free property, the sleeping beauty phenomenon, and high clustering. This thesis proposes an evolution model of citation networks that explains all of these properties, and applies the evolution rules revealed by the model to predict the development of citation networks. The main contributions are as follows:
1. This thesis proposes the Adjustable Preferential Attachment Model (APA Model) to describe citation networks. The APA Model is first built from the two major formation mechanisms of citation networks: the node aging mechanism and the edge copying mechanism. The influence of the parameters of these two mechanisms on network structure is studied through both analytical calculation and numerical simulation. The relationships between the two mechanisms of the APA Model and the five structural properties of citation networks are also analyzed. These relationships show that the APA Model describes citation networks well and explains each of the structural properties.
2. This thesis presents a parameter estimation method for the APA Model to validate that the model reasonably describes real citation networks. Simulated networks are constructed from the parameters estimated on real citation networks, and their consistency with the real networks is analyzed on the five structural properties. The result shows that the APA Model reasonably describes real citation networks and simultaneously explains their structural properties. The reasons why different real citation networks yield different parameters are also given. Because the APA Model reasonably describes real citation networks, it can reveal the evolution rules of citation networks.
3. Based on the APA Model, this thesis proposes an algorithm to predict prospective hot research topics. Following the citation growth law revealed by the APA Model, the probability that a paper obtains new citations is predicted from its recent citations. Experimental results demonstrate that the new algorithm achieves higher prediction accuracy than other prediction algorithms. Rank aggregation confirms that prospective hot research topics can be reliably predicted using recent citations alone. Finally, ranking by recent citations is integrated into a literature search engine together with query expansion, so that the search engine helps users learn the detailed research content and hot research topics of a user-specified research field.
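The idea in contribution 3, ranking papers by recent citations only, can be sketched as follows. The abstract does not define how "recent" is measured or how scores are computed, so the sliding `window` of years, the function name `rank_by_recent_citations`, and the input format are assumptions for illustration rather than the thesis's algorithm.

```python
from collections import Counter

def rank_by_recent_citations(citations, current_year, window=2, top_k=10):
    """Rank papers by the number of citations received in the last `window` years.

    citations: iterable of (cited_paper_id, citation_year) pairs.
    The window length is a hypothetical parameter, not a value from the thesis.
    """
    recent = Counter(
        paper for paper, year in citations
        if current_year - window < year <= current_year
    )
    # Highest recent count first; ties broken by paper id for a stable order.
    ranked = sorted(recent.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Paper A has the most citations overall, but all of them are old.
example = [("A", 2001), ("A", 2002), ("A", 2003), ("A", 2003),
           ("B", 2006), ("B", 2007), ("B", 2007), ("C", 2007)]
print(rank_by_recent_citations(example, current_year=2007))
# [('B', 3), ('C', 1)]
```

In this example a ranking by total citation count would put A first, whereas the recent-citation ranking drops it entirely, which is the contrast the abstract's claim rests on.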

  • 【Online publication contributor】 Tsinghua University
  • 【Online publication issue】 2009, Issue 06