Analysis of Pathogenic Genetic Loci Based on Several Machine Learning Algorithms

【Authors】 FANG Ya-lan; KU Zai-qiang

【Corresponding author】 KU Zai-qiang

【Affiliation】 College of Mathematics and Statistics, Huanggang Normal University

【Abstract】 The identification and screening of SNP loci in genes has become an increasingly important topic in association studies of complex diseases and genes. This paper first applies a locus classification scheme commonly used in medicine to a gene database for a certain type of disease: the allele frequency at each locus is computed over the whole sample to determine the major and minor alleles, and the base-pair information (A, T, C, G) at each locus is converted into a numerical code. Second, the chi-square test is used to coarsely screen candidate SNP loci. Finally, machine learning algorithms including Random Forest, Bagging, AdaBoost, and Lasso Logistic are applied to select the loci on which the algorithms give consistent results, and cross-validation is used to verify the validity of the screening results.
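
The first two steps described in the abstract (numerical encoding by minor allele, then a chi-square screen) can be illustrated with a minimal sketch. The abstract does not give the exact encoding, so the 0/1/2 minor-allele count used here is an assumption, as are the function names, the locus names, the toy genotypes, and the 0.05 significance level; none of these come from the paper.

```python
from collections import Counter

# 0.05 critical value of the chi-square distribution with 2 degrees of
# freedom (3 genotype codes x 2 classes). The significance level is an
# assumption; the abstract does not state one.
CHI2_CRITICAL_DF2 = 5.991


def encode_locus(genotypes):
    """Encode genotype strings such as 'AG' as minor-allele counts (0, 1, 2)."""
    allele_counts = Counter(base for g in genotypes for base in g)
    # The least frequent allele over the whole sample is treated as the
    # minor allele; on a tie, the allele seen first is chosen.
    minor = min(allele_counts, key=allele_counts.get)
    return [g.count(minor) for g in genotypes]


def chi_square_statistic(codes, labels):
    """Pearson chi-square statistic of the genotype-code x label table."""
    n = len(codes)
    row_totals = Counter(codes)
    col_totals = Counter(labels)
    observed = Counter(zip(codes, labels))
    stat = 0.0
    for code, row_n in row_totals.items():
        for label, col_n in col_totals.items():
            expected = row_n * col_n / n
            stat += (observed[(code, label)] - expected) ** 2 / expected
    return stat


def screen_loci(loci, labels, threshold=CHI2_CRITICAL_DF2):
    """Return the names of loci whose statistic exceeds the threshold."""
    kept = []
    for name, genotypes in loci.items():
        stat = chi_square_statistic(encode_locus(genotypes), labels)
        if stat > threshold:
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Toy data: 10 controls (label 0) followed by 10 cases (label 1).
    labels = [0] * 10 + [1] * 10
    loci = {
        # Genotype distribution differs between controls and cases.
        "locus_assoc": ["AA"] * 8 + ["AG"] * 2 + ["AG"] * 4 + ["GG"] * 6,
        # Identical genotype distribution in both groups.
        "locus_null": ["CC", "CT", "TT", "CC", "CT"] * 4,
    }
    print(screen_loci(loci, labels))
```

In a pipeline like the one the abstract outlines, the loci that survive this coarse screen would then be passed to the classifiers, and only loci that several of them agree on would be retained.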

【Funding】 2018 Huanggang Normal University Master of Education Teaching Case Project (JYJXAL2018001)
  • 【Source】 Journal of Huanggang Normal University (黄冈师范学院学报), 2019, Issue 03
  • 【Classification codes】 TP181; R394
  • 【Times cited】 2
  • 【Downloads】 184