Prediction of Splice Sites Based on the Support Vector Machine (SVM) Method

【Author】 Li Yanqing

【Advisor】 Ning Zhengyuan

【Author Information】 Fujian Agriculture and Forestry University, Bioinformatics Science and Technology, 2012, Master's thesis

【Abstract】 As ever more genome data are generated, predicting genes with bioinformatics methods has become an important topic in the study of gene expression and function. Splice sites in eukaryotic genes strongly influence how gene function is expressed, so splice site prediction is a very important subtask of gene prediction and matters for a complete understanding of genes. This thesis treats splice site recognition as a classification problem: sequence features near candidate splice sites are used to distinguish true sites from false ones. First, a fuzzy support vector machine (SVM) based on mixture kernels is used to recognize splice sites, and its recognition performance is compared with that of a standard SVM. The results show some improvement over a standard single-kernel SVM. Next, a multi-SVM method is proposed: it combines different prediction information with weights, uses a simulated annealing algorithm to select the parameters of the multi-SVM model, and yields a single combined final result. Experimental results show that this simple method can also improve the splice site recognition rate.
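The weighted multi-SVM combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the objective function, the weight constraints, or the cooling schedule, so everything here is an assumption: the per-model decision scores are hypothetical toy numbers standing in for the outputs of separately trained SVMs, the objective is plain accuracy, weights are kept non-negative and summing to 1, and the schedule is geometric. All function names are invented for illustration and are not taken from the thesis.

```python
import math
import random


def combined_score(scores, weights):
    """Weighted sum of the per-model decision scores for one candidate site."""
    return sum(w * s for w, s in zip(weights, scores))


def accuracy(weights, score_rows, labels):
    """Fraction of sites whose combined score has the same sign as its label (+1/-1)."""
    correct = 0
    for scores, label in zip(score_rows, labels):
        predicted = 1 if combined_score(scores, weights) >= 0 else -1
        correct += predicted == label
    return correct / len(labels)


def normalize(weights):
    """Clip weights at zero and rescale them to sum to 1 (assumed constraint)."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in clipped]


def anneal_weights(score_rows, labels, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Search for combination weights with simulated annealing (assumed setup)."""
    rng = random.Random(seed)
    n_models = len(score_rows[0])
    current = [1.0 / n_models] * n_models  # start from equal weights
    current_acc = accuracy(current, score_rows, labels)
    best, best_acc = current, current_acc
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a neighbor by perturbing each weight with Gaussian noise.
        candidate = normalize([w + rng.gauss(0.0, 0.1) for w in current])
        candidate_acc = accuracy(candidate, score_rows, labels)
        delta = candidate_acc - current_acc
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_acc = candidate, candidate_acc
            if current_acc > best_acc:
                best, best_acc = current, current_acc
    return best, best_acc


# Hypothetical decision scores from three models for six candidate sites.
toy_scores = [
    [0.9, -0.2, 0.4],
    [0.3, 0.5, -0.1],
    [-0.7, 0.1, -0.6],
    [-0.2, -0.8, 0.3],
    [0.6, 0.2, -0.5],
    [-0.4, 0.3, -0.9],
]
toy_labels = [1, 1, -1, -1, 1, -1]  # +1 = true site, -1 = false site

weights, acc = anneal_weights(toy_scores, toy_labels)
print("weights:", weights, "accuracy:", acc)
```

In practice the weights would be tuned on a held-out validation set rather than on the data used to train the individual models, and the returned accuracy is never lower than that of the equal-weight starting point because the best solution seen so far is tracked separately from the current one.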

  • 【Classification No.】 Q811.4; TP18