Study on the Methods of Haplotype Analysis for Candidate Genes of Animal Quantitative Traits

【Author】 Wang Qishan (王起山)

【Advisor】 Pan Yuchun (潘玉春)

【Author information】 Shanghai Jiao Tong University, Biomedical Engineering, 2007, PhD

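【Summary】 In recent years, many researchers have used the candidate gene approach to study candidate genes for quantitative traits. However, candidate gene analysis has substantial problems in both its statistical methods and its application. A candidate gene usually contains several polymorphic sites. Because different mutations affect gene function to different degrees, analyses based on different polymorphic sites often reach different conclusions, and the effect of a single site is often so small that it is hard to detect. Moreover, the ultimate purpose of candidate gene analysis is to use candidate gene genotypes as a basis for conservation or selection, yet how to use candidate genes to assist conservation or selection has never been fully resolved. Two cases arise. First, if there is only one candidate gene, its effect is still relatively small even when it is large enough to be called a major gene, so conservation or selection based on it alone would be biased, and even as an aid its contribution would be very limited. Second, when multiple polymorphic sites of a candidate gene are analyzed separately, the genotypes of one individual at different sites in the same gene often rank inconsistently, so that on the one hand no single best individual exists and on the other the merit of the individuals that do exist is hard to judge.

In human genome research, haplotype analysis has already been applied to association analysis between SNP loci and complex diseases. A haplotype is a set of associated SNP sites located on one chromosome or within one region. Much evidence shows that multiple mutation sites in cis within a single gene (for example, on the same haplotype) can interact to form a super allele that has a large effect on the phenotype. Haplotype analysis therefore offers a more convenient and more effective way to use SNP information to explore the genetic mechanisms of traits, especially complex quantitative traits in animals. The work in this dissertation comprises the following.

1. Haplotype analysis based on the general animal model. To address the shortcomings of current candidate gene methods, this dissertation combines the statistical advantages of haplotypes with the traditional animal model and establishes a haplotype animal model for screening candidate genes of animal quantitative traits. Because conventional restricted maximum likelihood is hard to solve when the number of parameters to be estimated is relatively large, Gibbs sampling was used to estimate the fixed haplotype effects and the variance components of the other random effects, and the parameter derivation is described in detail. Score statistics were used to test the global and haplotype-specific null hypotheses, and the type I error rate and power of the tests were studied by simulation. Simulated haplotypes and trait values were generated according to the haplotype tag SNP frequency, the haplotype diversity, and the presence or absence of other systematic environmental effects. The simulations show that haplotype diversity and haplotype tag SNP frequency have little influence on the type I error rate of the global null hypothesis: the simulated rates are all very close to the nominal α. Whether or not other fixed effects are considered, the proposed haplotype animal model fits well, which indicates that the model is stable. For the global test, the simulated power of the proposed model is high with or without other fixed effects, although power is slightly higher at medium and low haplotype diversity than at high haplotype diversity. For the haplotype-specific test, the type I error rate deviates from the nominal level when haplotype diversity is high and the frequency of the haplotype under study is low. This differs from the result for the global null hypothesis and indicates that, when haplotype diversity is high, tag SNPs should be identified first, or some other method used to reduce the dimensionality of the haplotypes, before the model is used to estimate effects. The simulated power of the haplotype-specific test is consistently high and is affected little by haplotype frequency. When haplotype diversity is high and the specific haplotype frequency is low (q = 0.1), power decreases both with and without other fixed effects, and it is slightly higher without other fixed effects than with them.

2. Haplotype analysis based on the random regression model. Many important production traits of livestock and poultry, such as milk yield, litter size, and egg production rate, are longitudinal traits with repeated measurements. Developing candidate gene haplotype analysis methods for such traits is of theoretical and practical importance for enriching the theory of quantitative trait candidate gene research and for speeding up genetic improvement. Building on the general random regression model, this study establishes a haplotype random regression model for longitudinal traits with pedigree information. Gibbs sampling was used to estimate the fixed regression coefficients of the haplotype effects, the random regression coefficients of the random effects, and their variance components, and the parameter derivation is discussed in detail. The type I error rate and power of the global and haplotype-specific tests were studied by simulation. For the global null hypothesis, the tabulated type I error rates are very close to the nominal α at both α = 0.05 and α = 0.01. The proposed haplotype random regression model fits well under both equally spaced and unequally spaced sampling. Haplotype diversity and haplotype tag SNP frequency have little influence on the type I error rate, which shows the stability and general applicability of the model. The simulated power of the global test is close to 1 in all cases, but it is slightly higher under equally spaced sampling than under unequally spaced sampling, which suggests that researchers should sample at equal intervals where possible when designing experiments and then analyze the data with the model. For the haplotype-specific test under equally spaced sampling, the tabulated type I error rates are very close to the nominal α at both α = 0.05 and α = 0.01. Under unequally spaced sampling, the type I error rate deviates slightly from the nominal level when haplotype diversity is high and the specific haplotype frequency is low (q = 0.1). The power of the haplotype-specific test does not differ significantly between equally spaced and unequally spaced sampling at any test level, but under both schemes it decreases when haplotype diversity is high and the specific haplotype frequency is low (q = 0.1).

3. Haplotype analysis of the ESR gene in Meishan pigs. The Meishan pig is an outstanding conserved genetic resource of China and is famous worldwide for its high fertility. Preserving and then making good use of this valuable resource is essential for the economic returns and sustainable development of the pig industry. This study therefore takes the Meishan conservation population as its study population and applies the haplotype random regression model to candidate genes for reproductive traits in Meishan pigs, in order to guide haplotype-assisted selection and conservation on pig farms and to speed up genetic improvement. Haplotypes formed by three polymorphic sites of the ESR gene, PvuII, AvaI and MspA1I, were analyzed. The results show that haplotype ABB has a significant effect on litter size in each parity. Its effect on litter size follows a smooth curve: the effect is negative in parities 1, 11 and 12 and relatively large in parities 4 to 7. To facilitate genetic evaluation, individuals were ranked by the estimated haplotype effect and by the EBV of the remaining minor polygenes. The ranking by the sum of breeding values over parities 1 to 12 is not consistent with the ranking by ABB haplotype effect, so screening on a single indicator may wrongly eliminate outstanding individuals. The combined ranking used in this study takes full account of both the effect of haplotype ABB, which significantly affects litter size, and the breeding value of the remaining minor polygenes, and a relatively reasonable selection and mating plan can be drawn up from the combined breeding value ranking.

4. Development of the haplotype association analysis software system. On the basis of the proposed haplotype animal model and haplotype random regression model, we developed a user-friendly and powerful haplotype association analysis software system and obtained two software copyrights, which lays a foundation for wider application. The system is divided into functional modules. Each module can complete its assigned task independently, and the modules also work together to carry out a full series of analyses, from simulation to statistical analysis. The modules include a management and preprocessing module for data such as genotypes and time covariates, a module for generating simulated phenotypic data, a module for inferring haplotypes and tag SNPs, a model evaluation and selection module, a haplotype animal model statistical analysis module, and a haplotype random regression model statistical analysis module.

In summary, to address the problems in candidate gene research, this dissertation establishes a haplotype animal model and a haplotype random regression model, confirms their reliability by simulation, and applies the model system to the study of reproductive traits in Meishan pigs. For complex pedigree data and longitudinal measurements in particular, the study provides more reliable haplotype association analysis methods. The results not only advance research on candidate genes for animal quantitative traits but also lay a theoretical foundation for haplotype marker-assisted selection and mating, and they provide breeders and researchers with convenient application software.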
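The random regression idea summarized above, in which a haplotype's effect on litter size is a smooth function of parity, can be illustrated with a short sketch. Everything specific in it is an assumption made for illustration: the record does not state which basis functions the dissertation uses, so Legendre polynomials (a common choice in random regression models) stand in here, and the function names `parity_basis` and `haplotype_curve` and the coefficient values are hypothetical, not results from the study.

```python
import numpy as np
from numpy.polynomial import legendre


def parity_basis(parities, order=2, lo=1, hi=12):
    """Legendre covariates for each parity.

    Parities in [lo, hi] are rescaled to [-1, 1], the interval on which
    Legendre polynomials are defined. The returned matrix has one row per
    parity and order + 1 columns (P0, P1, ..., P_order).
    """
    p = np.asarray(parities, dtype=float)
    x = 2.0 * (p - lo) / (hi - lo) - 1.0
    return legendre.legvander(x, order)


def haplotype_curve(coefs, parities, lo=1, hi=12):
    """Haplotype effect at each parity for given fixed regression coefficients."""
    coefs = np.asarray(coefs, dtype=float)
    basis = parity_basis(parities, order=len(coefs) - 1, lo=lo, hi=hi)
    return basis @ coefs


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to produce a curve that is
    # lower at the first and last parities than in the middle.
    illustrative_coefs = [0.2, 0.0, -0.4]
    for parity, effect in zip(range(1, 13),
                              haplotype_curve(illustrative_coefs, range(1, 13))):
        print(f"parity {parity:2d}: effect {effect:+.3f}")
```

In a full model the coefficients would be estimated jointly with the polygenic random regression coefficients and variance components (by Gibbs sampling in the dissertation); the sketch only shows how a small set of coefficients maps to a parity-specific effect.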

【Abstract】 In recent years, many researchers have studied candidate genes for quantitative traits using the candidate gene approach. However, candidate gene analysis has many problems in both its statistical methodology and its application. A candidate gene usually contains several polymorphic loci, analyses of different loci often lead to different conclusions, and the effect of a single locus is often too small to detect. In addition, the ultimate aim of candidate gene analysis is to support conservation or selection, yet how to use candidate genes for assisted conservation or selection remains an unresolved problem. Two situations arise. First, if there is only one candidate gene, its effect is still relatively small even when it is large enough to be called a major gene, so selection on that gene alone would be biased, and even as a supplement its contribution would be very limited. Second, when several polymorphic loci of a candidate gene are analyzed separately, the merits of an individual's genotypes at different loci of the same gene are often inconsistent, so the best individuals cannot be identified reliably.

In human genome research, haplotype analysis has already been applied to association analysis between SNP loci and complex diseases. A haplotype is a group of linked SNP sites located on one chromosome or within one region. Much evidence shows that mutations in cis within a single gene (that is, on the same haplotype) can interact to form a super allele with a major effect on the phenotype. Haplotype analysis therefore provides a more convenient and more efficient way to use SNP information in the study of complex genetic traits. The research comprises the following.

1 Establishment of the haplotype animal model: A haplotype animal model was established by combining the statistical advantages of haplotypes with the conventional animal model.
We developed a Gibbs sampling method to estimate the haplotype fixed effects and the variance components of the other random effects. The hypothesis tests were evaluated by simulating their type I error rate and power. Haplotypes and trait values were simulated under different haplotype tag SNP frequencies, levels of haplotype diversity, and environmental effects. The simulations show that haplotype diversity and haplotype tag SNP frequency have little influence on the global test, whose type I error rate is very close to the nominal level. Whether or not other fixed effects are considered, the haplotype animal model fits well, which indicates that the model is stable. The simulated power of the global test is high with or without other fixed effects, although it is slightly higher at low and medium haplotype diversity than at high haplotype diversity. For the haplotype-specific test, the type I error rate deviates from the nominal level when haplotype diversity is high and the haplotype frequency is low. This result differs from that of the global test and suggests that tag SNPs should be identified first, or another method used to reduce the dimensionality of the haplotypes, before the model is used to estimate effects. The power of the haplotype-specific test is consistently high and does not differ markedly with or without other fixed effects, but in both cases it decreases when haplotype diversity is high and the specific haplotype frequency is low (q = 0.1).

2 Establishment of the haplotype random regression model: Quantitative traits such as milk yield in dairy cows, litter size in pigs, and fruit size in tomatoes are known to change over time; they are inherently longitudinal.
Developing candidate gene haplotype analysis methods for such traits can enrich the theory of quantitative trait genetics and speed up genetic improvement. This study extends the range of application of haplotype association analysis. Gibbs sampling was used to estimate the haplotype fixed effects and to predict the random regression coefficients. The hypothesis tests were evaluated by simulating their type I error rate and power. For the global test, the tabulated type I error rates are very close to the nominal value at both α = 0.05 and α = 0.01. The haplotype random regression model fits well under both balanced and unbalanced sampling. Haplotype diversity and haplotype tag SNP frequency have little effect on the type I error rate, which indicates that the model is stable. The simulated power of the global test for the proposed haplotype random regression model approaches 1, and it is slightly higher under balanced sampling than under unbalanced sampling, which suggests that researchers should use balanced sampling as far as possible in experimental design. For the haplotype-specific test under balanced sampling, the tabulated type I error rates are very close to the nominal value at both α = 0.05 and α = 0.01. Under unbalanced sampling, the type I error rate deviates slightly from the nominal level when haplotype diversity is high and the specific haplotype frequency is low (q = 0.1). The power of the haplotype-specific test does not differ significantly between balanced and unbalanced sampling.
With high haplotype diversity and a low specific haplotype frequency (q = 0.1), power decreases under both balanced and unbalanced sampling.

3 Haplotype analysis of the ESR gene in Meishan pigs: The Meishan pig is famous for its outstanding reproductive performance. Preserving this precious genetic resource is necessary for the economic benefit and sustainable development of the pig industry. This study carried out a series of investigations on the conserved Meishan population at the Jiading Meishan pig breeding center, Shanghai, and obtained a number of instructive results. We applied the haplotype random regression model to candidate genes for reproductive traits in Meishan pigs in order to guide haplotype-assisted selection and conservation on pig farms, with a view to accelerating genetic improvement. We analyzed haplotypes formed by three polymorphic sites (PvuII, AvaI and MspA1I) of the ESR gene. The results show that haplotype ABB has a significant effect on litter size. Studies of the PvuII site by scholars in China and abroad indicate that the BB genotype is favorable. In our study, however, haplotypes carrying the T nucleotide at the first site had no significant effect on reproductive performance, whereas haplotype ABB, which carries the C nucleotide at the first site, had a significant effect on litter size, probably because the study population has relatively unique genetic resources. To facilitate genetic improvement, the estimated haplotype effects and the EBVs of the remaining minor genes were ranked. The ranking by breeding value over parities 1 to 12 is not consistent with the ranking by ABB haplotype effect, so screening on a single indicator may wrongly eliminate outstanding individuals.
The combined ranking used in this study takes full account of both the effect of haplotype ABB, which significantly affects litter size, and the breeding value of the remaining minor genes, so a relatively reasonable selection and mating program can be drawn up from both phenotypic and molecular information.

4 Development of the haplotype association analysis software system: We developed a powerful, easy-to-use haplotype association analysis software system with a friendly interface and obtained two software copyrights for it, which lays a foundation for wider application. The system is divided into functional modules. Each module can complete its assigned task independently, and the modules also work together to carry out a full series of analyses, from simulation to statistical analysis. They include a management and preprocessing module for genotype data and time covariates, a module for simulating phenotypic data, a haplotype and tag SNP analysis module, a model evaluation module, a haplotype animal model statistical analysis module, and a haplotype random regression model statistical analysis module.

In summary, we established a haplotype animal model and a haplotype random regression model to screen for significant haplotypes. Simulation studies confirmed the reliability of the models, which were then applied to the study of reproductive traits in Meishan pigs. For complex pedigree information and longitudinal data in particular, our study puts forward more reliable haplotype association analysis methods. The results will not only advance research on candidate genes for quantitative traits but also provide a theoretical foundation for implementing haplotype marker-assisted selection, as well as convenient application software for breeders and researchers.
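The simulation design described in the abstract, in which the type I error rate and power of a haplotype-specific test are estimated at a given haplotype frequency q, can be sketched in a much simplified form. This is not the dissertation's method: it assumes unrelated individuals with no pedigree, draws the haplotype dosage as Binomial(2, q), and swaps in an ordinary least squares Wald test with a normal critical value in place of the score test under the animal model. The function name `rejection_rate` and all default values are hypothetical.

```python
import numpy as np


def rejection_rate(beta, q=0.1, n=500, reps=2000, z_crit=1.96, seed=0):
    """Monte Carlo rejection rate of a haplotype-specific test.

    beta   : true additive effect of the target haplotype (0 gives the null)
    q      : frequency of the target haplotype
    n      : number of unrelated individuals per replicate (assumption)
    z_crit : two-sided critical value; 1.96 approximates alpha = 0.05
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        # Copies of the target haplotype carried by each individual (0, 1 or 2).
        dosage = rng.binomial(2, q, size=n).astype(float)
        # Phenotype: haplotype effect plus standard normal residual.
        y = beta * dosage + rng.normal(size=n)
        X = np.column_stack([np.ones(n), dosage])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        sigma2 = resid @ resid / (n - 2)
        se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
        rejections += abs(coef[1] / se) > z_crit
    return rejections / reps


if __name__ == "__main__":
    print("empirical type I error:", rejection_rate(beta=0.0))
    print("empirical power:       ", rejection_rate(beta=0.5))
```

With `beta=0.0` the returned rate estimates the type I error and should sit near the nominal level; with a nonzero `beta` it estimates power. Lowering `q` or `n` is a quick way to see how rare haplotypes reduce power, which is the qualitative pattern the abstract reports at q = 0.1.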

【Key words】 haplotype; random regression; animal model; Bayes; litter size