
A Novel Method for Mapping QTL Controlling Seed Traits (种子性状QTL作图新方法)

【Author】 Hu Zhiqiu (胡治球)

【Advisor】 Xu Chenwu (徐辰武)

【Author information】 Yangzhou University, Crop Genetics and Breeding, 2007, PhD

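【Abstract (Chinese original, translated)】 The entire development of a cereal seed takes place on the maternal plant, which supplies the seed with sink capacity and nutrients. The genetic expression of endosperm and embryo traits may therefore be influenced by the genotype of the maternal plant in addition to the genotypes of the endosperm and embryo themselves. Several specialized methods for mapping QTL of endosperm traits have been proposed, but all of them ignore the maternal effects that may exist in the expression of seed traits. Based on the expression characteristics of endosperm and embryo traits, this thesis proposes a new interval mapping method for seed-trait QTL that includes maternal effects. The method estimates the genetic parameters of a QTL by maximum likelihood, implemented with the EM algorithm. The basic procedure is: (1) use the molecular marker genotypes of the segregating population to infer the conditional probabilities of the embryo and endosperm genotypes; (2) following Bayes' formula, combine the conditional probabilities of the QTL genotypes with the phenotypic observations of the seed trait to compute the posterior probability of each QTL genotype; (3) compute the conditional expectations of the missing variables from the posterior probabilities, and obtain estimates of the genetic parameters from these expectations; (4) repeat steps 2 and 3 until convergence. The parameter estimates at convergence are the maximum likelihood estimates.

Because both the genotype of the seed and the genotype of the maternal plant bearing it may affect seed traits, the QTL genotypes of both generations, that is, the joint QTL genotype, must be considered in the expression of embryo and endosperm traits. This study considers two strategies for using molecular marker information. Strategy 1 uses only the marker genotypes of the maternal plants of the segregating population to infer the joint QTL genotype of the embryo or endosperm of the selfed seeds on each plant. Strategy 2 uses the marker genotypes of the maternal plant together with the marker genotypes of the embryos of its selfed seeds to infer the joint QTL genotype of the embryo or endosperm. On this basis, separate QTL mapping methods were developed for endosperm traits and for embryo traits according to the genetic features of each tissue.

The feasibility and effectiveness of the method were verified with computer-simulated data. The factors examined in the simulations were QTL heritability, the number of plants in the segregating population, and the number of seeds measured per plant. Each treatment was replicated 100 times. The criteria examined were the statistical power of QTL detection and the accuracy and precision of the estimates of QTL position and effects. The simulation studies are as follows.

Strategy 1: marker genotypes of the maternal plants only. This strategy requires only the marker genotypes of the maternal plants and the phenotypic observations of the seed trait. Given the difference in ploidy between endosperm and embryo, separate analysis models were given for the two classes of traits, and each was verified by simulation.

The endosperm-trait study used 36 treatment combinations formed from different levels of the 3 factors. The results show: (1) The new method has high statistical power for endosperm traits. Among the 36 treatments, it failed to detect all QTL in only 5; the other 31 treatments reached a power of 100%. In addition, with 100 F2 plants and only 10 endosperms measured per plant, the new method had 100% power to detect a QTL with a heritability of only 5%. (2) The new method accurately estimates all genetic effects of the putative QTL under different expression patterns, and so avoids the systematic bias in parameter estimates caused by a deficient model. For example, with 200 F2 plants and 20 endosperms per plant, QTL of both high and low heritability were detected well, and their positions and effects were estimated with good precision.

The embryo-trait study used 12 treatment combinations formed from different levels of the 3 factors. For each treatment, data were simulated under 3 schemes corresponding to different modes of expression of the embryo trait: constitutive expression (scheme 1), expression only in the tissues of the plant (scheme 2), and expression only in the embryo (scheme 3). Each simulated data set was analysed with the new method proposed here (method I), with a maternal-effect model that ignores embryo genetic effects (method II), and with an embryo-effect model that ignores maternal genetic effects (method III). The results show: (1) Under all 3 schemes, the detection ability of method I is slightly higher than that of methods II and III. For example, for a QTL with a heritability of 10%, using 500 plants and 20 embryos measured individually per plant: under scheme 1, only method I accurately estimates all genetic effects of the putative QTL, while the estimates from methods II and III deviate considerably from the true values; under scheme 2, the estimates from methods I and II are close to the true values, while the genetic-effect estimates from method III are systematically biased; under scheme 3, the estimates from methods I and III are close to the true values, while method II cannot accurately estimate the corresponding parameters. (2) Besides heritability and sample size, the expression pattern of the QTL also affects how efficiently it is detected. A QTL is detected more easily when maternal effects account for a larger share of the variation in the embryo trait. For example, with the marker genotypes of 100 F2 plants and 5 embryo observations per plant, for a QTL with a heritability of 5%, the statistical powers of the 3 methods were 77%, 74% and 75% under scheme 1; 81%, 81% and 80% under scheme 2; and 27%, 18% and 18% under scheme 3.

Strategy 2: marker genotypes of both the maternal and the offspring generation. In strategy 1 the joint QTL genotype is obtained indirectly from the maternal QTL genotype rather than inferred directly from marker genotypes. To eliminate or reduce the dependence of the offspring QTL genotype on the maternal QTL genotype, a further mapping method is proposed that infers the QTL genotypes jointly from the marker genotypes of the maternal plant and the embryo. This method is more complex than strategy 1: it requires the marker genotypes of both the maternal plant and the embryo as well as the phenotypic observations of the seed trait. The simulation study for strategy 2 likewise set 12 treatments over the 3 factors, and each treatment was analysed with the 3 methods used in the embryo-trait simulations of strategy 1. The results show: (1) Using the marker genotypes of the offspring themselves further improves the accuracy with which the joint QTL genotype is inferred, so the analysis has higher statistical power. For example, with marker information from a single generation only, 1 of the 12 treatments for endosperm traits and 2 of the 12 for embryo traits failed to detect all QTL, whereas the method that infers the joint QTL genotype from the markers of both generations reached a power of 100% in every treatment tested. (2) The standard deviations of all genetic parameter estimates under this strategy are smaller than the corresponding values under strategy 1. This indicates that inferring the joint QTL genotype from the marker genotypes of both generations reduces, to some extent, the dependence of the inferred offspring QTL genotype on the maternal QTL genotype, and so markedly improves the reliability of the results. However, the genotypes of the two generations of a self-pollinating plant are inherently correlated, and even marker information from both generations cannot distinguish the two heterozygous endosperm genotypes a priori. The two dominance effects of an endosperm trait can therefore be estimated accurately only with a relatively large sample.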
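The four-step EM procedure described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: it assumes normally distributed phenotypes with a common residual variance, estimates one mean per QTL genotype rather than the additive, dominance and maternal effects of the full model, and takes the marker-based conditional probabilities of step 1 as given. The function names `em_genotype_means` and `simulate`, and all simulated values, are hypothetical.

```python
import math
import random


def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_genotype_means(y, prior, n_iter=500, tol=1e-8):
    """EM for a normal mixture whose mixing weights are known per observation.

    y     : phenotypic values, one per seed
    prior : prior[i][g] = P(genotype g | markers) for seed i (step 1, assumed given)
    Returns (means, var, loglik); loglik is evaluated at the parameters
    that entered the last E-step.
    """
    n, n_geno = len(y), len(prior[0])
    # Starting values: overall mean shifted slightly per genotype, overall variance.
    ybar = sum(y) / n
    var = sum((v - ybar) ** 2 for v in y) / n
    sd = math.sqrt(var)
    means = [ybar + sd * (g - (n_geno - 1) / 2.0) / n_geno for g in range(n_geno)]
    old_ll, ll = -math.inf, -math.inf
    for _ in range(n_iter):
        # Step 2 (E-step): posterior genotype probabilities by Bayes' formula.
        post, ll = [], 0.0
        for yi, pi in zip(y, prior):
            w = [p * normal_pdf(yi, m, var) for p, m in zip(pi, means)]
            total = sum(w)
            ll += math.log(total)
            post.append([x / total for x in w])
        # Step 3 (M-step): posterior-weighted means and pooled residual variance.
        for g in range(n_geno):
            wsum = sum(p[g] for p in post)
            if wsum > 0:
                means[g] = sum(p[g] * yi for p, yi in zip(post, y)) / wsum
        var = sum(p[g] * (yi - means[g]) ** 2
                  for p, yi in zip(post, y) for g in range(n_geno)) / n
        # Step 4: stop once the log-likelihood no longer improves.
        if abs(ll - old_ll) < tol:
            break
        old_ll = ll
    return means, var, ll


def simulate(n=600, true_means=(8.0, 10.0, 12.0), sd=1.0, seed=1):
    """Hypothetical test data: informative but imperfect genotype priors."""
    rng = random.Random(seed)
    k = len(true_means)
    y, prior = [], []
    for _ in range(n):
        p = [0.05] * k
        p[rng.randrange(k)] = 0.9          # class favoured by the markers
        g = rng.choices(range(k), weights=p)[0]  # true genotype drawn from the prior
        y.append(rng.gauss(true_means[g], sd))
        prior.append(p)
    return y, prior


y, prior = simulate()
means, var, loglik = em_genotype_means(y, prior)
print([round(m, 2) for m in means], round(var, 2))
```

In an interval-mapping scan this fit would be repeated at each candidate QTL position, with `prior` recomputed from the flanking markers at that position; the abstract does not give those formulas, so they are left out of the sketch.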

【Abstract】 Crop seeds form and develop on the maternal plant, which plays a pivotal role in the development of the embryo and endosperm. The genetic expression of seed traits can be controlled exclusively by the embryo genotype or by the maternal genotype, and sometimes by both simultaneously. Statistical methods designed specifically for mapping QTL controlling endosperm traits have been proposed by several researchers in recent years, but all of these methods ignore the influence of the maternal genome on seed development. On the basis of the expression features of seed traits, a new statistical method is proposed for identifying the expression mode and mapping QTL controlling embryo and endosperm traits. The maximum likelihood method, implemented via the expectation-maximization (EM) algorithm, is used to estimate the parameters of a putative QTL. The algorithm may be summarized in the following steps: (1) calculate the conditional probabilities of the QTL genotypes from the molecular marker information of the segregating population; (2) calculate the posterior probabilities by combining these conditional probabilities with the phenotypic values of the seed trait; (3) calculate the expectations of the missing values, then solve for the estimates of all genetic parameters, which replace the initial values and complete the first iteration; (4) repeat steps 2 and 3 until convergence. The estimates at convergence are the maximum likelihood estimates (MLE) of the parameters.

Since seed traits can be influenced by the maternal and the offspring genomes simultaneously, the genetic effects of the seed and of its maternal genome should be modeled in the form of a joint maternal-offspring genotype. Two strategies are adopted for genotyping molecular markers. In strategy 1, the conditional probabilities are inferred from marker genotypes of maternal tissue only. In strategy 2, marker genotypes of the embryo and of its sporophyte are used jointly to calculate the conditional probabilities. Given that the endosperm and the embryo have different ploidy levels and are formed through different inheritance mechanisms, endosperm and embryo traits are modeled separately.

Extensive simulations were performed to investigate the statistical properties of the proposed approach. The factors considered in the simulations are QTL heritability, the number of plants in the segregating population, and the number of seeds measured per plant. Each treatment combination was replicated 100 times. The principal statistical properties investigated are the empirical statistical power and the precision and accuracy of the estimates of QTL location and effects. The two simulation strategies are summarized as follows.

Strategy 1: marker genotypes derived purely from the maternal genome. This strategy requires only the molecular marker genotypes of the maternal plants and the phenotypic observations of the quantitative seed trait. Considering the genetic difference between endosperm and embryo, we propose a specific genetic model for each.

A total of 36 treatment combinations of the experimental factors were used to investigate the performance of the method in mapping endosperm traits. The results show that: (1) Only 5 of the 36 treatments have power below 100%, while all other treatments reach a power of 100%, suggesting that the new model is highly powerful in detecting QTL controlling endosperm traits. Even with only 100 F2 plants and 10 endosperms per plant, the proposed method still has 100% power to detect a QTL whose heritability is only 5%. (2) The method estimates QTL position and effects with high precision and accuracy in most settings. For example, with markers at a moderate density of 10 cM, 200 F2 plants and 20 endosperms per plant, the new method provides accurate estimates of both QTL effects and locations with high statistical power, regardless of the level of QTL heritability.

The statistical properties of the method in mapping embryo traits were further investigated through simulation of 12 treatments. The genetic effects of the QTL were assigned according to 3 schemes, each representing a specific expression pattern: scheme 1, the embryo trait is affected by the QTL genotypes of both the embryo and its maternal plant; scheme 2, only the maternal QTL genotype influences the trait; and scheme 3, only embryo genotypic effects exist. All simulated data were analyzed under the full model (method I), the maternal model (method II) and the embryo model (method III). The results show that: (1) Method I has slightly higher statistical power than the other two methods. For a QTL with a heritability of 10%, given 500 F2 plants and 20 embryos per plant, under scheme 1 only method I properly estimates all genetic parameters of the QTL, while the estimates from methods II and III are substantially biased. Under scheme 2, the estimates from methods I and II are close to the true values, but those from method III are biased. Under scheme 3, methods I and III properly estimate the genetic effects, while the estimates from method II are biased. (2) The statistical power of QTL detection is influenced by the QTL expression pattern as well as by heritability and sample size. A QTL is detected more easily when maternal effects make a larger contribution to the phenotypic variation. For a QTL with a heritability of 5%, given 100 F2 plants and 5 embryos per plant, the detection powers of the 3 methods are 77%, 74% and 75% under scheme 1; 81%, 81% and 80% under scheme 2; and 27%, 18% and 18% under scheme 3, respectively.

Strategy 2: marker genotypes derived from both the maternal and the offspring genomes. In strategy 1, the conditional probabilities of the joint maternal-offspring QTL genotypes for each individual are inferred from the QTL genotype of the maternal plant rather than from the seed genome, so the conditional probabilities of the seed QTL genotypes depend heavily on those of the maternal plant. We therefore further propose strategy 2, which uses molecular markers from both the maternal and the offspring genomes to infer the probabilities of the joint maternal-offspring genotypes of the underlying QTL. This strategy is more complicated than the previous one in that it needs the marker genotype of the embryo in addition to the marker genotype of the maternal sporophyte and the phenotypic value of the seed trait. The statistical properties of this strategy were investigated through simulation of 12 treatments for embryo traits and for endosperm traits. The results show that: (1) This strategy is more powerful because it uses the additional molecular markers from the embryo genome. When only the maternal markers are used, 1 treatment for endosperm traits and 2 treatments for embryo traits have power below 100%; when the embryo is also genotyped, all treatments in the simulation studies reach a power of 100%. (2) Parameter estimates from this strategy have smaller standard deviations than those from strategy 1, suggesting that using molecular markers from both the embryo and its maternal tissue decreases the correlation between the calculated conditional probabilities of the genotypes of the two generations. Because the correlation between offspring and maternal genotypes is unavoidable, and because the two heterozygous endosperm genotypes always share equal conditional probabilities, we suggest that a reasonably large population be used for proper estimation of the endosperm dominance effects.
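The closing point about the two heterozygous endosperm genotypes can be illustrated with a short Python sketch of the joint embryo-endosperm genotype distribution for a selfed plant. It rests on standard double-fertilization genetics (the embryo receives one maternal and one paternal allele; the triploid endosperm receives two copies of that same maternal allele plus one paternal allele) and conditions on the maternal QTL genotype itself rather than on flanking markers, whose recombination fractions the abstract does not give. The function names and the allele labels `Q`/`q` are illustrative assumptions, not taken from the thesis.

```python
from fractions import Fraction


def gamete_probs(maternal):
    """Allele frequencies among the gametes of a diploid genotype, e.g. 'Qq'."""
    probs = {}
    for allele in maternal:
        probs[allele] = probs.get(allele, Fraction(0)) + Fraction(1, 2)
    return probs


def joint_seed_genotypes(maternal):
    """P(embryo genotype, endosperm genotype) for seeds of a selfed plant.

    Embryo    = egg allele + sperm allele (diploid).
    Endosperm = two copies of the egg's allele + sperm allele (triploid),
                because the polar nuclei match the egg and both sperm of a
                pollen grain are identical.
    """
    gam = gamete_probs(maternal)
    joint = {}
    for egg, p_egg in gam.items():
        for sperm, p_sperm in gam.items():
            embryo = "".join(sorted(egg + sperm))        # 'Q' sorts before 'q'
            endosperm = "".join(sorted(egg * 2 + sperm))
            key = (embryo, endosperm)
            joint[key] = joint.get(key, Fraction(0)) + p_egg * p_sperm
    return joint


def marginal(joint, index):
    """Marginal distribution of the embryo (index 0) or endosperm (index 1)."""
    out = {}
    for key, p in joint.items():
        out[key[index]] = out.get(key[index], Fraction(0)) + p
    return out


for mother in ("QQ", "Qq", "qq"):
    print(mother, joint_seed_genotypes(mother))
```

For a heterozygous mother, `joint_seed_genotypes("Qq")` assigns probability 1/4 to each of the four joint classes, so an embryo typed as Qq is equally likely to accompany a QQq or a Qqq endosperm. This is why genotyping the embryo does not separate the two heterozygous endosperm classes a priori.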

  • 【Online publication contributor】 Yangzhou University
  • 【Online publication year and issue】 2007, Issue 06