

Analysis of Patterns of Collaboration of Authors in Coauthorship Networks

【Author】 吕海拜

【Advisor】 冯玉强

【Author Information】 Harbin Institute of Technology, Management Science and Engineering, 2010, Ph.D.

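【Chinese Abstract】 The rapid development of science and technology and the continual deepening of scientific research make it difficult for researchers to complete a project or paper on their own. Dividing the work and discussing problems together can raise both the quality of research results and research efficiency. Academic papers, the main form of research output, likewise show a pattern of multi-author publication: almost all academic papers are written and published jointly by several authors, single-author papers are rare, and this trend is becoming more pronounced. At the same time, many stages of science and technology management, such as selecting review experts and formulating science and technology policy, involve evaluating experts and scholars. When evaluating them, therefore, one should not look only at individual attributes such as professional title or the number and quality of published papers, but should also examine their collaborative research behavior from the perspective of the collaborative relationship expressed by co-authored papers. Against this background, this dissertation builds on the co-authorship network (a network of nodes and edges, in which a node represents an author and an edge indicates that two authors have published a paper together) and takes the collaboration patterns of authors in co-authorship networks as its research focus. These patterns are divided into several aspects: centrality, the distribution pattern across different research directions, and author roles based on network structure. Author centrality is further divided into two parts: eigenvector centrality that takes collaborative strength into account, and extensity centrality. For the data set, papers from the proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining were selected, and the ACM SIGKDD co-authorship network was constructed from them. The dissertation applies Salton's measure, which measures the strength of collaboration between two regions, to the co-authorship network in order to measure the collaborative strength between authors. This collaborative strength is then introduced into eigenvector centrality, and the dissertation analyzes how eigenvector centrality differs depending on whether collaborative strength is taken into account. The results show that introducing the Salton-based collaborative strength does affect the computed eigenvector centrality. The correlation between eigenvector centrality with collaborative strength and degree centrality is also analyzed. Authors' collaboration patterns also include the extensity of their collaborative relationships. Based on this idea, a new centrality for measuring how active an author is in collaboration is proposed: extensity centrality. Its computation is based on the principle of Shannon's entropy and uses both the Salton-based collaborative strength and the subgroups defined by lambda sets. The results show that authors with high extensity centrality also have high betweenness centrality, whereas the converse does not necessarily hold. The correlation between extensity centrality and degree centrality is also analyzed. Regarding the distribution pattern of authors' research directions, the dissertation distinguishes three finer aspects: the number of an author's research directions, the heterogeneity of those directions, and the evenness of their distribution. After proposing methods for computing heterogeneity and evenness, it focuses on the correlations between the distribution pattern of research directions and both degree centrality and extensity centrality. Authors are also classified. Finally, the dissertation applies automorphic equivalence to analyze author roles based on network structure and, on that basis, abstracts several typical roles from them. The results show that when automorphic equivalence is used for role analysis of authors in co-authorship networks, one cannot rely on the strictly exact results alone, since doing so tends to overlook the roles of some authors in relatively important positions. Moreover, authors' roles in the network compensate for the shortcomings of degree centrality. These studies not only enrich the theoretical results of co-authorship network research but also provide a useful reference for science and technology management and the formulation of science and technology policy.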

【Abstract】 The rapid development of technology and the growing depth of scientific research make it hard for researchers to complete projects and papers alone. Collaboration can improve both the quality of research output and research efficiency. Papers, the primary output of research, are now almost always written by several authors, and this phenomenon is becoming more and more prominent. At the same time, many parts of technology management, such as choosing experts to review projects and drawing up science policy, involve the evaluation of experts. When evaluating experts, therefore, we should also attach importance to their collaboration in scientific research. Against this background, the present dissertation takes authors' patterns of collaboration as the focus of its research. We divide the patterns of collaboration into authors' centralities, the patterns of distribution across different subjects, and the roles based on the structure of the co-authorship network. The centralities are further divided into the eigenvector centrality based on collaborative strength and the extensity centrality based on collaborative strength. As the data set, we choose the papers in the proceedings of the ACM SIGKDD international conference. From this data set we construct a network that we term the co-authorship network of ACM SIGKDD. We take Salton's measure, which is used to measure the collaborative strength between two regions, and apply it to measuring the collaborative strength between two authors. We then introduce this collaborative strength into the eigenvector centrality and analyze how the eigenvector centrality differs when the collaborative strength is considered and when it is not. The results indicate that the collaborative strength does make a difference. In addition, the correlation between the eigenvector centrality based on collaborative strength and the degree centrality is analyzed. Authors' patterns of collaboration also include the extensity of authors' collaborative relationships. Based on this extensity, we propose a new centrality that measures how active an author is in collaboration, namely the extensity centrality. This new centrality is based on Shannon's entropy, the collaborative strength, and the communities defined by lambda sets. The results indicate that authors with high extensity centrality also have high betweenness centrality, but the converse is not necessarily true. In addition, the correlation between the extensity centrality and the degree centrality is analyzed. As for the patterns of distribution of subjects, we divide them into the number of an author's subjects, the heterogeneity of the subjects, and the evenness of the distribution of subjects. After proposing methods for computing the heterogeneity and the evenness, we analyze the correlation between the patterns of distribution of subjects and both the degree centrality and the extensity centrality. We also classify the authors. Finally, we apply automorphic equivalence analysis to the analysis of authors' roles in co-authorship networks and, on that basis, extract some representative roles from them. The results indicate that, when applying automorphic equivalence analysis to authors' roles, we cannot focus only on the exact results, because doing so omits the roles of some authors in important positions. In addition, authors' roles in co-authorship networks can make up for the shortcomings of the degree centrality. The research in this dissertation not only enriches the theory of co-authorship networks but also offers a useful reference for technology management and the drawing up of science policy.
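The abstract describes weighting co-authorship edges with Salton's measure and feeding those weights into eigenvector centrality. The following is a minimal sketch of that idea, not the dissertation's implementation. It assumes the common cosine form of Salton's measure, strength(a, b) = n_ab / sqrt(n_a * n_b), where n_ab is the number of papers two authors wrote together and n_a, n_b are their individual paper counts. The function names, the toy paper list, and the shifted power iteration are all illustrative assumptions.

```python
import math
from collections import defaultdict
from itertools import combinations


def salton_strengths(papers):
    """Return {(a, b): strength} for every co-author pair (a < b).

    Assumed form of Salton's measure: joint papers divided by the
    geometric mean of the two authors' paper counts.
    """
    paper_count = defaultdict(int)  # papers per author
    joint_count = defaultdict(int)  # co-authored papers per author pair
    for authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            paper_count[a] += 1
        for a, b in combinations(unique, 2):
            joint_count[(a, b)] += 1
    return {
        (a, b): n / math.sqrt(paper_count[a] * paper_count[b])
        for (a, b), n in joint_count.items()
    }


def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on an undirected weighted graph {(a, b): w}.

    Iterates x <- (A + I) x; the identity shift keeps the dominant
    eigenvector of A but avoids oscillation on bipartite graphs.
    """
    neighbours = defaultdict(dict)
    for (a, b), w in weights.items():
        neighbours[a][b] = w
        neighbours[b][a] = w
    score = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        new = {
            n: score[n] + sum(w * score[m] for m, w in neighbours[n].items())
            for n in neighbours
        }
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in new) < tol:
            return new
        score = new
    return score


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [["A", "B"], ["A", "B"], ["A", "C"], ["B", "C", "D"]]

strengths = salton_strengths(papers)
weighted = eigenvector_centrality(strengths)
unweighted = eigenvector_centrality({pair: 1.0 for pair in strengths})

for author in sorted(weighted):
    print(author, round(unweighted[author], 3), round(weighted[author], 3))
```

Comparing the two printed columns shows, on toy data, the kind of difference the abstract reports between eigenvector centrality with and without collaborative strength. The extensity centrality is not sketched here because the abstract does not give its formula.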


