急性髓细胞白血病基因筛选模型的贝叶斯分析
Bayesian Statistical Analysis of Acute Myeloid Leukemia Gene Expression Data
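【Abstract (translated from the Chinese)】 Gene expression data carry a large amount of biological information. In research on genetic information, screening for differentially expressed genes, whose expression levels change significantly, is key to understanding how diseases develop and to supporting targeted drug research. Using gene expression data for acute myeloid leukemia (AML), we construct a sequence of gene mean differences, build a Bayesian hierarchical mixture model, and assign the model parameters priors that reflect the biological characteristics of genes. The model parameters are estimated with a Markov chain Monte Carlo (MCMC) algorithm, and the differentially expressed AML genes are then screened out. In the real-data analysis, an AML gene data set was obtained from the high-throughput gene expression database of the U.S. National Center for Biotechnology Information (NCBI). From 14688 AML genes that had passed nonspecific filtering, 711 differentially expressed genes were identified, only 4.84% of the total, a result consistent with the biological principle of differential gene expression.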
【Abstract】 Gene expression data contain a great deal of biological information, and detecting differentially expressed genes helps researchers understand diseases and discover new drugs. In this paper, a Bayesian hierarchical normal mixture model, with the number of components fixed at three, is constructed to detect differentially expressed genes in acute myeloid leukemia. Specific priors that reflect the biological characteristics of genes are introduced into the model to make it more practical. The parameters are estimated via the Markov chain Monte Carlo (MCMC) method. A data set from the U.S. National Center for Biotechnology Information is analyzed. The results show that 711 of the 14688 acute myeloid leukemia genes are differentially expressed; that is, differentially expressed genes account for 4.84% of the total. This is consistent with the biological principle that most genes are not differentially expressed.
【Key words】 hierarchical mixture model; acute myeloid leukemia; gene expression; MCMC algorithm
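The abstract names a three-component normal mixture fitted by MCMC but gives no details of the priors or the sampler. The following is only a minimal sketch of that general idea, not the paper's model: it runs a plain Gibbs sampler on simulated mean differences, with components for down-regulated, unchanged, and up-regulated genes. The function names, the toy data, the prior settings (`alpha`, `m0`, `s0_sq`, `a0`, `b0`), and the choice to hold the middle component's mean at zero are all assumptions made for illustration.

```python
import numpy as np


def simulate_mean_differences(n_genes=2000, frac_de=0.10, seed=1):
    """Toy data (NOT the AML data set): most genes are null, a few are shifted."""
    rng = np.random.default_rng(seed)
    n_side = int(n_genes * frac_de / 2)
    labels = np.ones(n_genes, dtype=int)      # 1 = not differentially expressed
    labels[:n_side] = 0                       # 0 = down-regulated
    labels[n_side:2 * n_side] = 2             # 2 = up-regulated
    centers = np.array([-2.0, 0.0, 2.0])[labels]
    spreads = np.array([0.5, 0.3, 0.5])[labels]
    return rng.normal(centers, spreads), labels


def gibbs_three_component(d, n_iter=300, burn_in=100, seed=0):
    """Gibbs sampler for d_g ~ sum_k w_k N(mu_k, sigma2_k), k = down, null, up."""
    rng = np.random.default_rng(seed)
    n = d.size
    # Hypothetical priors, chosen only for illustration.
    alpha = np.array([1.0, 8.0, 1.0])   # Dirichlet prior: most genes are null
    m0 = np.array([-2.0, 0.0, 2.0])     # prior means for down / null / up
    s0_sq = 1.0                         # prior variance of the component means
    a0, b0 = 2.0, 1.0                   # inverse-gamma prior on each variance

    mu = m0.copy()
    sigma2 = np.ones(3)
    w = alpha / alpha.sum()
    hits = np.zeros((n, 3))             # how often each gene lands in each component
    w_sum = np.zeros(3)

    for it in range(n_iter):
        # 1) Sample component labels given the current parameters.
        log_p = (np.log(w) - 0.5 * np.log(sigma2)
                 - 0.5 * (d[:, None] - mu) ** 2 / sigma2)
        p = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        cdf = np.cumsum(p, axis=1)
        u = rng.random(n) * cdf[:, -1]
        z = np.minimum((u[:, None] > cdf).sum(axis=1), 2)

        # 2) Sample mixture weights from their Dirichlet full conditional.
        counts = np.bincount(z, minlength=3)
        w = rng.dirichlet(alpha + counts)

        # 3) Sample each variance (inverse gamma) and each free mean (normal).
        for k in range(3):
            dk = d[z == k]
            shape = a0 + 0.5 * dk.size
            rate = b0 + 0.5 * np.sum((dk - mu[k]) ** 2)
            sigma2[k] = 1.0 / rng.gamma(shape, 1.0 / rate)
            if k != 1:                  # the null mean is held at 0 (assumption)
                prec = 1.0 / s0_sq + dk.size / sigma2[k]
                mean = (m0[k] / s0_sq + dk.sum() / sigma2[k]) / prec
                mu[k] = rng.normal(mean, np.sqrt(1.0 / prec))

        if it >= burn_in:
            hits[np.arange(n), z] += 1
            w_sum += w

    kept = n_iter - burn_in
    return hits / kept, w_sum / kept    # posterior membership probs, mean weights
```

A gene could then be flagged as differentially expressed when its posterior probability of belonging to the middle component falls below some cutoff, for example `post[:, 1] < 0.5` after `post, w_mean = gibbs_three_component(d)`. The 0.5 cutoff is likewise an assumption; the abstract does not state the decision rule used.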
- 【Source】 数学的实践与认识 (Mathematics in Practice and Theory), 2019, Issue 03
- 【Classification code】 R733.71
- 【Downloads】 74