
Construction and Analysis of Arabidopsis Gene-Related Networks under Stimulus

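【Author】 Li Eryan (李二艳)

【Advisor】 Wang Shudong (王淑栋)

【Author information】 Shandong University of Science and Technology, Computational Mathematics, 2010, Master's thesis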

【Abstract】 Studying how genome expression in Arabidopsis changes after stimulation is important for understanding the biological mechanism by which a plant's genome responds to external environmental stimuli. Using reverse network modeling, this thesis constructs mutual-information correlation networks of abstract genes and of flowering-related genes, for both the normal control group and the stimulated experimental groups of Arabidopsis, and analyzes the structural properties of these networks. The main results are as follows.

For the abstract-gene mutual-information networks, a comparison based on complex-network statistics shows that the network structures of the experimental groups and the control group differ markedly. The parameters that effectively distinguish the two classes of networks are the average degree, the clustering coefficient, the modularity, and the proportion of non-isolated nodes. A method for significance-based classification of the normal and stimulus training sets in a multidimensional parameter space is proposed, and it confirms the existence of these differences from the point of view of the overall network structure.

For the flowering-time-related gene networks, comparing the experimental groups with the control group yields two characteristic parameters that distinguish their network structures: the average degree and the average coreness. An effective method is given for mining, from the network structure, the "structural key genes" that have an important influence on flowering. The numbers of structural key genes that contribute most to the degree and coreness library patterns (库模式) are 8 and 11, respectively, and all 8 genes obtained from the degree pattern are among the 11 genes obtained from the coreness pattern.

The idea that structure determines function suggests that a gene whose pattern matches the library pattern may promote flowering, whereas a gene whose pattern is opposite to the library pattern may inhibit it. Data in the TAIR database show that 8 of the 11 genes follow this rule, a prediction success rate of 72.73%, so the model can be used to predict whether a gene promotes or inhibits flowering. A second regularity was also found: when a gene's coreness and degree in the long-day experiment are higher than under short-day conditions, the gene may promote flowering; when its coreness and degree in the long-day experiment are lower than under normal conditions, the gene may inhibit flowering. TAIR data show that 5 of the 8 genes follow this rule, a prediction success rate of 62.5%, so this model can also predict the promoting or inhibiting function of a gene. When applied to expression-profile data of other plants under stimulus, these network modeling and analysis methods should be of general value for understanding how genomes respond to stimuli.
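The pipeline summarized in the abstract (build a mutual-information network from expression data, then compare structural statistics such as average degree, average coreness, and the proportion of non-isolated nodes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the equal-width binning, the threshold value, the toy expression data, and the function names `discretize`, `mutual_information`, `build_network`, `coreness`, and `summarize`. The clustering coefficient and modularity are omitted for brevity.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Map each value to an equal-width bin index in [0, bins - 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Mutual information, in bits, between two discrete sequences of equal length."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def build_network(expr, threshold, bins=3):
    """Link two genes when their mutual information exceeds `threshold`."""
    genes = sorted(expr)
    disc = {g: discretize(expr[g], bins) for g in genes}
    adj = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if mutual_information(disc[g], disc[h]) > threshold:
                adj[g].add(h)
                adj[h].add(g)
    return adj


def coreness(adj):
    """Core number of every node, found by repeatedly peeling a minimum-degree node."""
    deg = {v: len(nb) for v, nb in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=lambda u: (deg[u], u))
        k = max(k, deg[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core


def summarize(adj):
    """Structural statistics used to compare control and stimulus networks."""
    n = len(adj)
    core = coreness(adj)
    return {
        "average_degree": sum(len(nb) for nb in adj.values()) / n,
        "average_coreness": sum(core.values()) / n,
        "non_isolated_fraction": sum(1 for nb in adj.values() if nb) / n,
    }


# Hypothetical toy expression profiles: one list of samples per gene.
toy = {
    "g1": [1, 2, 3, 4, 5, 6],
    "g2": [2, 4, 6, 8, 10, 12],
    "g3": [6, 5, 4, 3, 2, 1],
    "g4": [1, 1, 1, 1, 1, 1],
}
net = build_network(toy, threshold=1.0)
stats = summarize(net)
print(stats)
```

Running `summarize` on a control network and on each stimulus network gives one point per condition in a small parameter space, which is the kind of comparison the abstract describes. In this toy case the constant gene `g4` carries no information about the others and stays isolated.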
