生物磁共振数据分析中的几个问题

Several Problems in Biological Magnetic Resonance Data Analysis

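【Author】 Sun Jianqiang (孙建强)

【Advisor】 Ding Yiming (丁义明)

【Author information】 Graduate University of the Chinese Academy of Sciences (Wuhan Institute of Physics and Mathematics), Applied Mathematics, 2014, PhD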

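【Abstract (translated from the Chinese)】 Nuclear magnetic resonance (NMR) is widely applied in physics, chemistry, biology, medicine, and other scientific fields. Biological magnetic resonance provides methodological support for life-science research at the molecular, cellular, and whole-organism levels, including the analysis of protein structure and dynamics and metabonomics analysis. This thesis pursues interdisciplinary research between mathematics and biological magnetic resonance: it develops data analysis and mathematical modeling for the NMR experimental data and phenomena that arise in metabonomics and in protein dynamics. The thesis has six chapters, centered on the theory of data analysis and mathematical modeling and its application to biological magnetic resonance. Chapter 1 briefly introduces the background on biological magnetic resonance data analysis needed in the thesis. Chapter 2 studies the central limit theorem for the Rényi relative entropy and its rate of convergence. By estimating the Rényi relative entropy between the normalized sum of zero-mean i.i.d. random variables and the normal distribution, it proves a central limit theorem for the Rényi relative entropy of order α (0<α<1) and obtains the exact rate of convergence of the Rényi relative entropy between the normalized sum Zn of i.i.d. random variables and a standard normal random variable G. On this basis it gives a method for testing whether a residual sequence is close to zero-mean i.i.d., which provides a theoretical foundation for model selection and model diagnosis. Chapter 3 establishes a new normalization method for one-dimensional spectral data in metabonomics: Clustering Partial Integral Normalization (CPIN). Normalization essentially means finding a reasonable reference standard against which changes in metabolites can be measured. We use hierarchical clustering to obtain candidate reference groups, balance the similarity and consistency of each reference group, and use OPLS to improve that consistency. We describe the procedure and rationale of CPIN in detail and demonstrate its effectiveness on two data sets. Chapter 4 addresses dimension reduction and visualization of one-dimensional spectral data in metabonomics, using kernel methods to extend conventional linear dimension reduction methods to corresponding nonlinear ones. It first gives rigorous mathematical derivations of linear dimension reduction methods commonly used for NMR data (such as PCA, LDA, PLS, and OPLS); it then combines them with the kernel trick to extend PLS and OPLS to the empirical kernel space; finally, it uses a real NMR data set to show the dimension reduction performance of these methods and the influence of kernel parameter settings on classification and dimension reduction. Chapter 5 studies mathematical modeling related to magnetic resonance experiments on the dynamics of biological macromolecules. This field concerns the transient structures of proteins and their functions. For the sugar phosphotransferase system of E. coli, we build a kinetic model of the relevant proteins using systems of ordinary differential equations, and explain the biological mechanisms that may underlie new NMR experimental findings such as weak protein-protein interactions and a binary phosphate-group transfer pathway. A simple elementary-reaction model illustrates the relationship between the dissociation constant and the phosphate-group transfer efficiency; a mathematical model of the transport system that includes the binary pathway is then built, and the phosphate-group transfer efficiencies of the ternary and binary pathways are computed. Chapter 6 summarizes the work of the thesis and lists open problems.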

【Abstract】 Nuclear magnetic resonance (NMR) is widely used in physics, chemistry, biology, medicine, and other scientific fields. Biological NMR provides methodological support for life-science research at the molecular, cellular, and whole-organism levels, especially in protein structure and dynamics and in metabonomics. We concentrate on an interdisciplinary field: bio-NMR data analysis and mathematical modeling. The experimental data and phenomena were provided by our collaborators at the Wuhan Magnetic Resonance Center. The thesis consists of six chapters. The first chapter introduces the background on data analysis and bio-NMR related to our work. In the second chapter, we prove a central limit theorem for the Rényi relative entropy of order α (0<α<1) and obtain a sharp rate of convergence. The proof rests on a careful analysis of the Rényi relative entropy between the distribution of the normalized sum of i.i.d. random variables and the Gaussian distribution. The resulting rate of convergence is applied to model selection and model diagnosis. In the third chapter, we propose a new normalization method for one-dimensional spectral data in metabolomics: CPIN (clustering partial integral normalization). The key idea of normalization is to select a group of bins as a reference against which the variations of metabolites are measured. We use hierarchical clustering to obtain candidate groups, balance the trade-off between similarity and diversity, and improve consistency with OPLS. The procedure and rationale of CPIN are described in detail, and its validity is demonstrated on two groups of samples of 1H spectra. Chapter four discusses dimension reduction and visualization of the NMR spectra of metabolites. We generalize conventional linear dimension reduction methods to corresponding nonlinear ones by using kernel methods. We give rigorous mathematical derivations of the dimension reduction methods widely used for NMR data in metabonomics (such as PCA, LDA, PLS, and OPLS), then extend PLS and OPLS to kernel space by using kernel methods. We use real NMR data of metabolites to show the validity of the proposed nonlinear dimension reduction methods. The fifth chapter presents mathematical modeling of the dynamics of biological macromolecules studied in magnetic resonance experiments. For the E. coli sugar phosphotransferase system, we establish a dynamic model of the proteins using ordinary differential equations and examine their weak interactions. The model captures the biological mechanisms that may underlie new NMR experimental findings. Specifically, a simple reaction model explains the relationship between the phosphate-group transfer efficiency and the dissociation constant, and clarifies the role of weak protein-protein interactions. We then build a mathematical model of the transport system that includes a binary pathway, and use it to compute the phosphate-group transfer efficiency of the binary and ternary pathways. Chapter six summarizes the work and lists some problems for further research.
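
For readers unfamiliar with the quantity studied in the second chapter, the block below recalls the standard definition of the Rényi relative entropy (Rényi divergence) of order α between two densities, together with the kind of statement a central limit theorem in this sense makes. It is only an orienting sketch: the symbols p, q, σ and the normalization of Z_n by σ√n are assumptions made here for illustration, and the thesis's exact hypotheses and convergence rate are not reproduced.

```latex
% Standard definition of the Renyi relative entropy of order alpha, 0 < alpha < 1,
% for probability densities p and q on the real line (notation assumed here).
D_\alpha(p \,\|\, q) \;=\; \frac{1}{\alpha - 1}\,
    \log \int_{\mathbb{R}} p(x)^{\alpha}\, q(x)^{1-\alpha}\, dx ,
\qquad 0 < \alpha < 1 .

% Normalized sum of i.i.d. random variables X_1, ..., X_n with mean zero
% and finite variance sigma^2 (normalization assumed here).
Z_n \;=\; \frac{X_1 + \cdots + X_n}{\sigma \sqrt{n}} .

% A central limit theorem in the Renyi sense states that, with G standard normal,
D_\alpha(Z_n \,\|\, G) \;\longrightarrow\; 0 \qquad (n \to \infty).
```

As α tends to 1, D_α recovers the Kullback-Leibler relative entropy, so a statement of this type sits alongside the classical entropic central limit theorem.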
