

Pathway Enrichment Analysis of Endometrium Related Microarrays and Application To Pig Reproduction Mechanisms

【Author】 Zhao Hongbo (赵洪波)

【Advisor】 Pan Yuchun (潘玉春)

【Author information】 Shanghai Jiao Tong University, Biomedical Engineering, 2010, PhD

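【Abstract (translated from the Chinese original)】 Reproductive performance directly determines production costs in the pig industry and therefore economic returns, so it is of great importance. Many factors affect the reproductive performance of pigs. The endometrium, as part of the female reproductive tract, is an important organ for maintaining physiological characteristics and reproductive function. It is highly proliferative, yet it is easily affected by miscarriage, infection, endocrine disturbance and other factors, which reduce its regenerative capacity and receptivity and seriously impair fertility. The gene chip (DNA microarray) is one of the key technologies for gene function analysis in the post-genomic era and has been widely used to study the genetic mechanisms of human diseases and of complex traits in model organisms. Because microarrays are costly, array fabrication and sample preparation and labeling are laborious, analysis systems are expensive, and the false-positive rate of signals is high, the technology is difficult to popularize. Moreover, microarray analysis and annotation require fairly complete genome sequence information, which greatly restricts the use of microarrays in pigs and other livestock; many researchers can only use low-density or cross-species microarrays. In pigs and other agricultural animals, although microarrays have detected many genes associated with important economic traits and diseases, the roles these genes play in growth, development and disease, and the mechanisms by which they interact, are not yet fully understood. Pig genetics and breeding can therefore be served from a comparative genomics perspective, using the whole-genome information of humans and of model organisms such as mouse. Genome-wide studies of endometrium-related diseases and abnormalities in humans, cattle and other species have accumulated considerable data with complete annotation. We can therefore analyze and integrate existing human and bovine microarray data and then align homologous genes by comparative genomics, with the aim of identifying key pathways and candidate genes that affect the pig endometrium. This lays a foundation for further study of the molecular genetic mechanisms of pig reproductive performance and for marker-assisted selection. The work in this thesis comprises the following:

1. Pathway enrichment analysis and integration of endometriosis microarray data

Relevant data collected from public databases were first processed with a uniform preprocessing method; gene set enrichment analysis (GSEA) was then applied to the endometriosis-related microarray data, and finally the results for each data set were integrated. In the 3 ovarian data sets, 17 commonly up-regulated and 23 commonly down-regulated pathways were found. In the 2 peritoneal data sets, 26 up-regulated pathways and 1 down-regulated pathway were consistent. Comparing the two types, 13 up-regulated pathways and 1 down-regulated pathway were shared. These pathways are mainly related to immune diseases and the immune system. Analysis of the uterine cycle data showed 12 significantly up-regulated and 18 significantly down-regulated pathways in the early secretory phase, and no significantly up-regulated but 29 significantly down-regulated pathways in the mid-secretory phase. Comparing the pathways obtained for the 3 phases showed very little overlap. GSEA of the data set that used endometrial endothelial cells as samples gave 46 significantly up-regulated pathways and 1 significantly down-regulated pathway; comparison with the results above gave 2 common pathways. The results obtained by GSEA are easier to interpret and more reliable, improve the reproducibility of microarray results, indicate directions for subsequent experimental validation, and provide reference pathways and genes for studying the mechanisms of pig reproduction at the molecular level.

2. Effect of negative energy balance on endometrium-related pathways and genes

Studying the effect of dietary energy level on the sow endometrium is important for the rational feeding of growing gilts and for improving reproduction rate and economic returns. This study used microarray expression data on negative energy balance (NEB) in dairy cows from the GEO database and applied GSEA to screen for pathways and genes that are significantly differentially expressed in the endometrium under NEB. The analysis gave 23 significantly up-regulated and 26 significantly down-regulated pathways. Most of the up-regulated pathways are related to the immune system and immune diseases: 7 pathways, including the Toll-like receptor signaling pathway and the T cell receptor signaling pathway, are immune system pathways, and 5 pathways, including primary immunodeficiency and autoimmune thyroid disease, are related to immune diseases. The significantly down-regulated pathways are mainly metabolism-related pathways, the oxidative phosphorylation pathway and the cell cycle pathway. Pathways related to immune diseases, neurodegenerative diseases and biosynthesis were the most numerous. Comparing the genes contained in each of these three classes of pathways gave three core gene lists. The core genes related to immune diseases are BOLA, BOLA-DMA, BOLA-DMB, BOLA-DQA1, BOLA-DQA2, BOLA-DQA5, BOLA-DQB, BOLA-DRA, BOLA-DRB3, CD80, CD86, FAS, GZMB, LOC512672, LOC525727 and PRF1; the genes related to neurodegenerative diseases are ATP5F1, ATP5G1, ATP5G3, COX7A2, CYC1, NDUFA5, NDUFB2, NDUFB5, NDUFB7, NDUFC1, NDUFS6, NDUFS8, NDUFV1 and UQCRB; and the core genes of the biosynthesis-related pathways are ALDOC, CS, DLD, IDH1, MDH2, PDHA1, PFKM and PKM2.

3. Screening of candidate pathways and genes affecting the pig endometrium

By analyzing the effects of disease and energy status on the endometrium, we found many common pathways and genes. The common pathways can serve as candidate pathways for studying the pig endometrium, and the common genes can be mapped to pig chromosomes as important genes affecting the pig endometrium. Comparing the results for the human ovarian endometriosis microarray data with those for the NEB data, we found 12 shared up-regulated and 10 shared down-regulated pathways. These 22 pathways can serve as candidate pathways affecting the pig endometrium, and the homologous genes they contain are the genes of primary interest. In addition, the PPAR signaling pathway was non-significant in only one data set and was also included as a candidate, giving 23 candidate pathways in total. The genes in each pathway were extracted, merged and de-duplicated, giving 212 human Entrez genes. Using the BioMart database, we searched for the pig homologs of these 212 human Entrez genes and assigned them to their chromosomal positions. Matches were found for 168 Entrez genes, of which 122 human Entrez genes corresponded to 140 pig UniGene homologs; for the other Entrez genes no corresponding pig UniGene homolog was found, but the start positions of the corresponding homologs on pig chromosomes were obtained.

4. Screening of reference genes for pig embryo and endometrium development

Embryonic development up to the attachment of the fetal membranes to the endometrium is a gradual process that has been studied extensively. Real-time quantitative PCR has become a common method for analyzing gene transcription levels because it is fast and reliable, and housekeeping genes are usually used for relative quantification. However, the expression of many housekeeping genes changes with the environment. Microarray data contain information on the whole genome and can be used to screen for genes that are stably expressed in a particular tissue, which can then serve as reference genes for relative quantification. We applied a meta-analysis method to integrate several microarray data sets on pig embryo and endometrium development and preliminarily selected the 100 most stably expressed genes as candidate reference genes; most of them encode ribosomal proteins.

In summary, by analyzing human and bovine microarray data, this thesis identified pathways and genes that are significantly differentially expressed in the endometrium under disease and under differences in energy status, and used comparative genomics to map the shared pathways and genes to pig chromosomes and obtain the corresponding pig homologs. These pathways and genes can serve as candidates affecting the pig endometrium. They not only deepen the understanding of known reproduction-related genes but also help to discover new important candidate genes for reproductive traits, providing a new basis for marker-assisted selection (MAS) and marker-assisted introgression (MAI) of sow reproductive traits. Finally, using GEO microarray data related to pig embryo development and the endometrium, a meta-analysis provided candidate reference genes for the accurate use of RT-PCR.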

【Abstract】 Reproductive performance is of great significance to the pig industry: it directly determines production costs and thus economic returns. Many factors affect the reproductive performance of pigs. The endometrium, as part of the female reproductive tract, is vital for maintaining physiological characteristics and reproductive function. Although highly proliferative, the endometrium is easily affected by miscarriage, infection, endocrine disturbance and other factors, which reduce its regenerative capacity and receptivity and seriously impair fertility. The DNA microarray is one of the most important technologies for analyzing gene function in the post-genomic era and is already widely used to study the genetic mechanisms of human diseases and of complex traits in model organisms. For lack of sequence information, the application of DNA microarrays in pigs and other livestock has been extremely limited, and many researchers can only use low-density or cross-species microarrays. High cost, laborious array fabrication and sample preparation and labeling, expensive analysis systems, and a high false-positive rate also make the technology difficult to popularize. In swine and other agricultural animals, although DNA microarrays have detected many genes related to important economic traits and diseases, the functions and interaction mechanisms of these genes in growth, development and disease are not yet completely clear. Much work remains to screen for specific genes and apply them to genetics and breeding. Endometrium-related microarray data sets with complete annotation are available at the genome-wide level for humans and other species. Here, we integrated these data sets and then mapped the results to the pig genome by comparative genomics.
Finally, we obtained critical pathways and genes that may affect the pig endometrium, which may lay a foundation for further research on the molecular genetic mechanisms of pig reproductive performance and on molecular marker-assisted selection.

The work in this thesis comprises the following:

1. Pathway enrichment analysis and integration of human endometriosis microarray data sets

First, related microarray data sets were collected from public databases and preprocessed with a standardized method; gene set enrichment analysis (GSEA) was then applied to the data, and finally the results were integrated. We found 17 up-regulated and 23 down-regulated pathways common to the ovarian endometriosis data sets, and 26 up-regulated pathways and 1 down-regulated pathway common to the peritoneal endometriosis data sets. Of these, 13 up-regulated pathways and 1 down-regulated pathway were consistent between ovarian and peritoneal endometriosis. The main canonical pathways identified are related to immunological and inflammatory disease. Of the three uterine cycle phases, the early secretory phase has the most over-represented pathways. Very few significant pathways overlap between the data set from human endometrial endothelial cells and the ovarian endometriosis data sets, which used whole tissue. The results obtained by GSEA are easier to interpret and more reliable, and they increase the concordance in identifying the biological mechanisms involved in endometriosis. The identified pathways shed light on endometriosis and may promote the development of novel therapies. The results also provide reference pathways and genes for research on the mechanisms of porcine reproduction.

2. Pathways and genes identified in cow endometrium under negative energy balance

Studying the impact of dietary energy level on the sow endometrium is important for improving reproduction rate and economic returns in sow breeding.
We examined the specific pathways deregulated under different states of negative energy balance (NEB). Data were downloaded from GEO and preprocessed with RMA, and GSEA was applied to the data sets. GSEA identified 23 up-regulated and 26 down-regulated pathways. Among the up-regulated pathways, the main canonical pathways are related to the immune system and immune disorders. Complement and coagulation cascades, Toll-like receptor signaling pathway, T cell receptor signaling pathway, B cell receptor signaling pathway, Hematopoietic cell lineage, Natural killer cell mediated cytotoxicity and Fc epsilon RI signaling pathway belong to the immune system. Primary immunodeficiency, Autoimmune thyroid disease, Allograft rejection, Graft-versus-host disease and Systemic lupus erythematosus are immune disorders. The down-regulated pathways are mainly metabolism, biosynthesis, neurological disease-related pathways, oxidative phosphorylation and the cell cycle.

The immune disorder-related pathways share a common core list of genes: BOLA, BOLA-DMA, BOLA-DMB, BOLA-DQA1, BOLA-DQA2, BOLA-DQA5, BOLA-DQB, BOLA-DRA, BoLA-DRB3, CD80, CD86, FAS, GZMB, LOC512672, LOC525727 and PRF1. The neurodegenerative disease-related pathways share a common core list of genes: ATP5F1, ATP5G1, ATP5G3, COX7A2, CYC1, NDUFA5, NDUFB2, NDUFB5, NDUFB7, NDUFC1, NDUFS6, NDUFS8, NDUFV1 and UQCRB. The common genes of the biosynthesis-related pathways are ALDOC, CS, DLD, IDH1, MDH2, PDHA1, PFKM and PKM2.

3. Critical pathways and genes in pig endometrium

The first two parts analyzed the effects of disease and of NEB on the endometrium, respectively. Many pathways and genes are common to both. The common pathways could serve as candidate pathways for pig endometrium research, and the common genes could be important genes regulating the pig endometrium.
We integrated the results of the two parts by comparative genomics to provide a reference for further research.

In ovarian endometriosis there are 17 common up-regulated and 23 common down-regulated pathways; in NEB there are 23 up-regulated and 26 down-regulated pathways. Comparing the lists, we found 12 common up-regulated and 10 common down-regulated pathways. The PPAR signaling pathway was non-significant in only one data set, so it was also included, giving 23 candidate pathways in total. We extracted and pooled the genes in each pathway, obtaining 212 human Entrez genes. Using the BioMart database, we matched these Entrez genes to pig homologous genes and their chromosomal loci. In total, 168 human Entrez genes had matching results: 144 human Entrez genes corresponded to 140 pig UniGene entries, and the remaining Entrez genes had no homologous UniGene entry, although we found the corresponding chromosome positions.

4. Selection of internal control genes for pig embryo and endometrium development

Embryonic development up to the attachment of the fetal membranes to the endometrium is a gradual process that has been studied extensively. RT-PCR is commonly used to analyze gene transcription levels because it is fast and reliable, and housekeeping genes are commonly used for relative quantification. Housekeeping genes take part in basic physiological processes in all cells and are not considered a primary target of changing conditions. In fact, however, the expression of many housekeeping genes changes with the environment. Microarray data contain information on the whole genome. We applied a meta-analysis method to combine several microarray data sets on embryo and endometrium development in pig and selected the 100 most stably expressed genes as candidate internal control genes.
The list is dominated by ribosomal protein genes.

To conclude, we first analyzed the pathways and genes affected by endometriosis in humans and by NEB in cattle separately, then matched the common pathways and genes as critical pathways and genes for pig. These can improve understanding of known reproduction-related genes, help to find new important genes for reproduction, and provide evidence for marker-assisted selection and marker-assisted introgression in pigs. Finally, we applied a meta-analysis method to combine several microarray data sets on embryo and endometrium development in pig and selected the 100 most stably expressed genes as candidate internal control genes for RT-PCR.
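
The integration step described in parts 1 and 3 (keeping pathways that are significant in every data set, with the PPAR signaling pathway admitted although it was missing from one data set) can be illustrated with a minimal sketch. It is not the thesis's actual code: the function name `consistent_pathways`, the `max_missing` parameter and the toy pathway sets are all hypothetical, and it assumes each data set has already been reduced to a set of significant pathway names for one direction of regulation.

```python
from collections import Counter


def consistent_pathways(per_dataset, max_missing=0):
    """Return pathways significant in at least len(per_dataset) - max_missing data sets.

    per_dataset: list of sets of pathway names (one set per data set,
                 all for the same direction of regulation).
    max_missing: number of data sets a pathway may be absent from
                 (hypothetical knob; 0 means a strict intersection).
    """
    # Count in how many data sets each pathway was called significant.
    counts = Counter(p for pathways in per_dataset for p in set(pathways))
    needed = len(per_dataset) - max_missing
    return sorted(p for p, n in counts.items() if n >= needed)


# Toy input, invented for illustration only (not results from the thesis).
up_regulated = [
    {"Toll-like receptor signaling pathway", "PPAR signaling pathway", "Allograft rejection"},
    {"Toll-like receptor signaling pathway", "PPAR signaling pathway"},
    {"Toll-like receptor signaling pathway", "Allograft rejection", "Cell cycle"},
]

strict = consistent_pathways(up_regulated)                  # present in all three sets
relaxed = consistent_pathways(up_regulated, max_missing=1)  # may be absent from one set

print(strict)
print(relaxed)
```

With `max_missing=0` the function is a plain set intersection; raising it to 1 mirrors the kind of exception made for a pathway that falls short in a single data set.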


