The Application of 3D Amino Acid Descriptors to the Quantitative Structure-Activity Relationship Study of Peptides

【Author】 Chen Ting (陈婷)

【Advisor】 Zhang Shengwan (张生万)

【Author information】 Shanxi University, Applied Chemistry, 2011, Master's thesis

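【Abstract, Chinese version】 In recent years, the quantitative structure-activity relationship (QSAR), as an indirect method, has been widely applied in computer-aided drug design and has become an indispensable tool. Parameterization of molecular structure is the key prerequisite for, and an essential component of, any QSAR study. It is well known that the amino acid sequence encodes the functional information and spatial structure of peptides and proteins. The structural information of amino acids is therefore crucial to QSAR studies of peptides. In addition, because three-dimensional (3D) descriptors directly reflect the non-bonded interactions between receptor and substrate during molecular interaction, QSAR models built on them have a clearer physicochemical meaning.

This thesis introduces three 3D amino acid descriptors into QSAR models relating the structure of several kinds of peptide drugs to their biological activity. The descriptors start from the most basic structural features of biomolecules and combine steric, electronic, and hydrophobic effects with whole-molecule 3D structural information, interactions among internal atoms, and the influence of external molecules. The work provides theoretical guidance for predicting the function of such drug molecules in the future. All samples were divided into a training set and a test set. QSAR models were built from the training set, their quality was assessed by leave-one-out (LOO) internal validation, and their external predictive ability was evaluated with several evaluation functions, to ensure that the models are genuinely valid.

The specific work is as follows:

(1) The 3D amino acid descriptor SVTD (Scores Vector of Three Dimension Descriptors), obtained by principal component analysis (PCA) of 721 descriptor variables extracted from the 3D information of the 20 natural amino acids, was applied to QSAR studies of 21 oxytocin analogues and 65 HLA (human leukocyte antigen)-A*0201-restricted CTL (cytotoxic T lymphocyte) epitope peptides, with satisfactory results. Models were built by multiple linear regression (MLR), and their stability was analyzed and tested by both internal and external validation. For the oxytocin samples, the model correlation coefficient (Rcum), the leave-one-out cross-validation (CV) correlation coefficient (Rcv), and the external validation correlation coefficient (Qext) were 0.981, 0.962, and 0.966, respectively. For the HLA-A*0201-restricted CTL epitope peptides, Rcum, Rcv, and Qext were 0.949, 0.899, and 0.922, respectively. The results show that SVTD characterizes the structural information of peptide molecules well and that the models fit and predict well, providing theoretical guidance for the development of this class of drugs.

(2) WHIM (weighted holistic invariant molecular) descriptors derived from the spatial configurations of the 20 natural amino acids were subjected to principal component analysis to give VSW (vector of principal component scores for weighted holistic invariant molecular index), which was applied to QSAR studies of 152 HLA-A*0201-restricted CTL epitope peptides and 101 cationic antimicrobial peptides. For the HLA-A*0201-restricted CTL epitope peptides, Rcum, Rcv, and Qext were 0.806, 0.756, and 0.693, respectively. For the antimicrobial peptides, Rcum, Rcv, and Qext were 0.869, 0.834, and 0.702, respectively. The results show that VSW can be used for activity prediction of peptide drugs and for the molecular design of new drugs.

(3) Principal component analysis of 23 electronic, 37 steric, 54 hydrophobic, and 5 hydrogen-bond properties of the natural amino acids gave DPPS (divided physicochemical property scores), which was applied to QSAR studies of 58 angiotensin-converting enzyme inhibitors and 25 HLA-Cw*0102 epitope peptides. For the angiotensin-converting enzyme inhibitors, Rcum, Rcv, and Qext were 0.943, 0.909, and 0.916, respectively. For the HLA-Cw*0102 epitope peptides, Rcum and Rcv were 0.868 and 0.795, respectively. The results show that DPPS, because of its clear physicochemical meaning, can be used to interpret QSAR models and therefore to guide the design of new, highly active molecules.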
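The validation scheme described above (training/test split, MLR fitting, leave-one-out internal validation, external validation) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: the data are synthetic random numbers, the helper names (`fit_mlr`, `predict_mlr`, `loo_predictions`, `corr`) are hypothetical, and the three statistics are computed here as plain Pearson correlations between observed and predicted values, which is an assumption because the abstract does not give the formulas behind Rcum, Rcv, and Qext.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply a fitted MLR coefficient vector to new descriptor rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def loo_predictions(X, y):
    """Predict each training sample from a model fitted without that sample."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    return preds

def corr(observed, predicted):
    """Pearson correlation between observed and predicted activities (assumed metric)."""
    return float(np.corrcoef(observed, predicted)[0, 1])

# Synthetic stand-in data: 40 "peptides", 3 descriptor variables each.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=40)

# Split into a training set and an external test set.
X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:], y[30:]

coef = fit_mlr(X_train, y_train)
r_fit = corr(y_train, predict_mlr(coef, X_train))        # fitting quality
r_cv = corr(y_train, loo_predictions(X_train, y_train))  # internal (LOO) validation
r_ext = corr(y_test, predict_mlr(coef, X_test))          # external validation

print(f"fit={r_fit:.3f}  LOO={r_cv:.3f}  external={r_ext:.3f}")
```

The external test samples never enter the fit, so `r_ext` measures prediction on unseen data, while the LOO value reuses the training set one held-out sample at a time.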

【Abstract】 In recent years, the quantitative structure-activity relationship (QSAR) approach, which investigates the quantitative relationship between molecular structural parameters and biological or other related activities, has developed widely and rapidly within computer-aided drug design (CADD). As an effective tool in drug research and design, QSAR has been widely applied in organic chemistry, pharmaceutical chemistry, environmental chemistry, computational chemistry, pesticide science, molecular biology, and other fields. Structural characterization is crucial to QSAR studies of peptides and proteins. Most of the structural and functional information of peptides and proteins is contained in their amino acid sequences, so the characteristics of their amino acid residues are of great significance to QSAR studies. 3D descriptors characterize structure more accurately because they reflect the non-bonding interactions between ligand and receptor. Finally, quantum chemistry, an important method for studying molecular structure and reaction theory, has been widely applied in QSAR and has greatly increased the accuracy of QSAR theory.

In this dissertation, we developed a series of 3D descriptors based on basic molecular structural features, taking into account common intramolecular and intermolecular non-bonding interactions such as electrostatic, steric, and hydrophobic interactions. Molecular structure parameterization methods and modeling methods were investigated and applied in QSAR as simple, direct, and effective ways of parameterizing molecular structure. At the same time, quantitative relationships between structure and activity/spectrum were built for several representative drugs. The results provide useful basic information for analyzing molecular spectra, function, and reaction mechanisms, for drug design, and for improving the efficiency of drug development.

The main contents are as follows:

(1) Scores Vector of Three Dimension Descriptors (SVTD), extracted by principal component analysis of 721 indexes of the 20 natural amino acids, was applied to QSAR studies of 21 oxytocin analogues and 65 HLA-A*0201-restricted CTL epitopes. Variables were first selected by stepwise multiple regression, models were then built by multiple linear regression, and finally the models were tested by internal and external validation. For the oxytocin analogues, the correlation coefficient (Rcum), the cross-validation correlation coefficient (Rcv), and the external validation correlation coefficient (Qext) were 0.981, 0.960, and 0.966, respectively. For the HLA-A*0201-restricted CTL epitopes, Rcum, Rcv, and Qext were 0.949, 0.899, and 0.922, respectively, showing that the models had favorable estimation and prediction capabilities.

(2) Vector of Principal Component Scores for Weighted Holistic Invariant Molecular Index (VSW), extracted by principal component analysis of the weighted holistic invariant molecular indexes of the 20 natural amino acids, was applied to QSAR studies of 152 HLA-A*0201-restricted CTL epitopes and 101 antimicrobial peptides. For the HLA-A*0201-restricted CTL epitopes, Rcum, Rcv, and Qext were 0.806, 0.756, and 0.693, respectively. For the antimicrobial peptides, Rcum, Rcv, and Qext were 0.869, 0.834, and 0.702, respectively. The favorable stability and good predictive capability of the models indicate that VSW is applicable to molecular structural characterization and biological activity prediction.

(3) Divided Physicochemical Property Scores (DPPS), extracted by principal component analysis of 23 electronic properties, 37 steric properties, 54 hydrophobic properties, and 5 hydrogen-bond properties of the 20 natural amino acids, was applied to QSAR studies of 58 angiotensin-converting enzyme (ACE) inhibitors and 25 HLA-Cw*0102 epitopes. For the ACE inhibitors, Rcum, Rcv, and Qext were 0.943, 0.909, and 0.916, respectively. For the HLA-Cw*0102 epitopes, Rcum and Rcv were 0.868 and 0.795, respectively. These satisfactory results show that DPPS may be a useful structural representation for peptide QSAR studies, owing to advantages such as easy manipulation, rich structural information, and high characterization capability.
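All three descriptors above share one construction: a table of per-amino-acid properties is compressed by principal component analysis, and each peptide is then represented by the scores of its residues. The following is a minimal sketch of that idea under stated assumptions, not the thesis's procedure. The property matrix is a random placeholder (the real 721 indexes, WHIM indexes, and physicochemical properties are not reproduced here), the number of retained components (3) is arbitrary, autoscaling before PCA is assumed, and the names `pca_scores` and `encode_peptide` are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 natural amino acids

def pca_scores(D, n_components):
    """Autoscale the columns of D, then return the leading PCA scores via SVD."""
    Z = (D - D.mean(axis=0)) / D.std(axis=0, ddof=1)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def encode_peptide(seq, score_table):
    """Concatenate the score vectors of the residues, in sequence order."""
    return np.concatenate([score_table[aa] for aa in seq])

# Placeholder property matrix: 20 amino acids x 50 made-up properties.
rng = np.random.default_rng(1)
D = rng.normal(size=(20, 50))

scores = pca_scores(D, n_components=3)        # one row of scores per amino acid
score_table = dict(zip(AMINO_ACIDS, scores))  # residue letter -> score vector

# An arbitrary nine-residue sequence, used only for illustration.
x = encode_peptide("LLFGYPVYV", score_table)
print(x.shape)  # (27,): 9 residues x 3 scores each
```

With this encoding every peptide of the same length maps to a descriptor vector of the same size, which is what lets a regression model such as MLR be fitted across a peptide set.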

  • 【Online publication contributor】 Shanxi University
  • 【Online publication year and issue】 2012, Issue 06