基于蛋白质组学和代谢组学的吸烟与肺癌相关性研究

Investigation of the Relationship between Smoking and Lung Cancer Based on Proteomics and Metabolomics

【Author】 Hu Jiyi (胡集祎)

【Supervisor】 Zheng Shu (郑树)

【Author information】 Zhejiang University, Oncology, 2013, PhD

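【Summary】 Objective and significance: Lung cancer is a malignant tumor arising from the bronchial mucosal epithelium; the term generally refers to tumors of the lung parenchyma. In recent years, the global incidence of lung cancer has increased 10-30 fold in men and 3-8 fold in women. In men, both its incidence and its mortality rank first worldwide; in women, its incidence ranks fourth and its mortality second. Smoking is very closely linked to lung cancer: about 80% of lung cancers in men and 50% of those in women are related to smoking. China is a major producer and consumer of tobacco, the number of smokers is extremely large, and that number will keep rising for a considerable time. In addition, lung cancer patients with a long smoking history may be unable to tolerate surgery because of impaired lung function. At present, lung cancer is mainly confirmed by CT and CT-guided needle biopsy. This approach is expensive and time-consuming, is not suitable for screening, and cannot detect some lung cancers with small lesions (<2 mm). Biomarkers such as CEA are used to monitor recurrence and metastasis, but their sensitivity and specificity are limited. There is therefore still no simple, practical screening method for lung cancer with high sensitivity and specificity, in particular one that can be used in smokers (the high-risk population). This study aims to examine the relationship between smoking and lung cancer comprehensively using systems biology. Cigarette smoke has a complex composition and induces equally complex changes in the body, which is why a systems biology approach is needed. The study explores the serum protein changes caused by smoking in an animal model and evaluates the overall effect of smoking, providing a basis for the subsequent serum proteomics of lung cancer. By comparing the lung cancer differential protein profiles and metabolite profiles of smokers and non-smokers, it looks for proteins and small-molecule metabolites that act as a bridge between smoking and lung cancer. Bioinformatics methods are used to identify serum and urine biomarkers with high sensitivity and specificity for lung cancer screening in smokers, and to build the corresponding prediction models.

Materials and methods: Controlled, standardized short-term and long-term cigarette smoke exposure models were first established in rats. Each model was divided into a control group, a low-toxicity smoke group, and a high-toxicity smoke group, with 20 SD rats per group. Rats were exposed to smoke for 3 days in the short-term model and for 90 days in the long-term model. Rat serum samples were processed on CM10 chips and measured by SELDI-TOF-MS, and the resulting mass spectrometry data were analyzed with the ZJU-PDAS bioinformatics software. Potential markers selected from the differentially expressed protein peaks were identified in the Swiss-Prot database using the TagIdent tool. Serum samples were collected from 59 lung cancer patients (39 with a smoking history, 20 without) and 103 healthy controls (45 with a previous smoking history, 58 without). Urine samples were collected from 49 lung cancer patients (31 with a previous smoking history, 18 without) and 79 healthy controls (36 with a previous smoking history, 43 without). The serum and urine samples came from the same population; some urine samples were excluded because of excessively high creatinine values. Serum samples were processed by weak cation exchange (WCX) and measured by MALDI-TOF-MS, and urine samples were measured by GC/MS. Both the proteomic and the metabolomic results were analyzed with ZJU-PDAS.

Results: In the rat cigarette smoke exposure models, 189 protein peaks were identified in the short-term model and 225 in the long-term model. Using a p-value below 0.05 and a fold change above 1.2 as cutoffs, and also taking peak shape and clustering among samples into account, one differential protein peak with a mass-to-charge ratio (m/z) of 3151 was selected in the short-term model. Seven differential protein peaks were selected in the long-term model, with m/z values of 3502, 3546, 5653, 5854, 7233, 7419 and 8005. These 8 peaks were searched against the Swiss-Prot database. The candidate protein for peak 3151, which was highly expressed in the low-toxicity and high-toxicity smoke groups, was pituitary adenylate cyclase-activating polypeptide (PACAP); the candidate protein for peak 5653, which was expressed at low levels in both smoke groups, was metastasis-suppressor KISS-1. Previous studies have found that elevated PACAP is associated with ischemic cardiomyopathy and non-small cell lung cancer, and that reduced KISS-1 is often observed in tumors with a high risk of metastasis.

In the serum proteomics experiment, the serum protein profiles of lung cancer patients and healthy controls with a smoking history were compared. Based on p-value, fold change, peak shape and clustering, 8 protein peaks were selected as potential biomarkers for lung cancer screening in smokers, with m/z values of 2485, 4790, 2892, 6010, 4878, 1324, 3045 and 4809. The same method yielded 6 protein peaks for lung cancer screening in non-smokers, with m/z values of 2485, 2706, 4790, 3555, 2513 and 2549. Comparing the lung cancer differential protein profiles of smokers and non-smokers showed that two peaks, 3045 and 6010, differed significantly only between lung cancer patients and healthy controls with a smoking history, so they may be related to smoking-induced lung cancer. Serum protein screening models for lung cancer were also built separately for smokers and non-smokers. The model for smokers predicted lung cancer and healthy controls with accuracies of 100% and 97.06%, respectively. The model for non-smokers predicted lung cancer and healthy controls with accuracies of 94.12% and 100%, respectively.

In the urine metabolomics experiment, 20 major endogenous metabolites common to the urine samples of lung cancer patients and healthy controls were found: alanine, oxalic acid, phosphoric acid, glycine, succinic acid, uracil, serine, threonine, 5-oxoproline, ribose, cis-aconitic acid, citric acid, glucose, galactose, tyrosine, palmitic acid, inositol, uric acid, stearic acid and pseudouridine. On this basis, a genetic algorithm combined with an SVM (both included in ZJU-PDAS) was used to build lung cancer prediction models based on urinary small-molecule metabolites for smokers and for non-smokers. The former predicted lung cancer and healthy controls with accuracies of 89.22% and 86.11%, respectively; the latter with accuracies of 73.33% and 95.35%, respectively. Metabolites differentially expressed in the urine of lung cancer patients were also identified for each population. In smokers these were ribose, glucose, oxalic acid, phosphoric acid, galactose, inositol, uric acid, citric acid and cis-aconitic acid. In non-smokers these were inositol, ribose, galactose, glucose, uracil and citric acid. These small-molecule metabolites may serve as potential urine biomarkers for lung cancer screening in smokers and non-smokers. A further comparison of the differential metabolites found in the two populations showed that oxalic acid, phosphoric acid, uric acid and cis-aconitic acid differed significantly only in the smoker comparison and not in the non-smoker comparison; these four metabolites may be related to smoking-induced lung cancer.

Conclusions and innovations: This study used proteomics to explore the serum protein changes caused by cigarette smoke exposure in a rat model and to assess the overall impact of cigarette smoke on the body. It combined proteomics and metabolomics to study the relationship between smoking and lung cancer systematically and comprehensively. In the rat cigarette smoke exposure experiments, changes in 8 relatively important protein peaks closely related to smoking were found. For two of them, 3151 (highly expressed) and 5653 (expressed at low levels), the candidate proteins in the database were pituitary adenylate cyclase-activating polypeptide (PACAP) and metastasis-suppressor KISS-1, respectively. Previous studies have shown that elevated PACAP is associated with ischemic heart disease and non-small cell lung cancer, and that reduced KISS-1 levels are observed in malignant tumors with a high tendency to metastasize. In the serum proteomics study of lung cancer, 8 and 6 protein peaks suitable for lung cancer screening were selected for smokers and non-smokers, respectively, and two peaks, 3045 and 6010, were found that may be related to smoking-induced lung cancer. In the metabolomics study, 9 and 6 potential urinary lung cancer biomarkers (small-molecule metabolites) were found for smokers and non-smokers, respectively. Further data mining identified four small-molecule metabolites, oxalic acid, phosphoric acid, uric acid and cis-aconitic acid, that may be related to smoking-induced lung cancer. These proteins and small-molecule metabolites provide molecular-level evidence for the relationship between smoking and lung cancer and a basis for studying the mechanism by which smoking causes lung cancer. The study also established lung cancer screening models based on proteins and small-molecule metabolites that have a degree of both sensitivity and specificity and that can be applied separately to smokers and non-smokers. This offers new ideas for the diagnosis, screening and monitoring of lung cancer, and in particular a method for more targeted lung cancer screening in the very large smoking population.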
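The peak-selection cutoffs described above (p-value below 0.05 and fold change above 1.2) can be illustrated with a minimal Python sketch. This is not the ZJU-PDAS implementation: the peak table, its m/z values, p-values and group means are hypothetical, and treating fold change symmetrically (larger mean over smaller mean) is an assumption, since the abstract does not say how down-regulated peaks were handled. The manual review of peak shape and clustering is not modeled.

```python
# Hypothetical peak table: the m/z values, p-values and group means below are
# made up for illustration and are NOT data from the study.
peaks = [
    {"mz": 1000.0, "p": 0.010, "mean_exposed": 15.0, "mean_control": 10.0},
    {"mz": 2000.0, "p": 0.200, "mean_exposed": 30.0, "mean_control": 10.0},
    {"mz": 3000.0, "p": 0.030, "mean_exposed": 10.5, "mean_control": 10.0},
    {"mz": 4000.0, "p": 0.001, "mean_exposed": 5.0, "mean_control": 10.0},
]

P_CUTOFF = 0.05     # p-value threshold stated in the abstract
FOLD_CUTOFF = 1.2   # fold-change threshold stated in the abstract


def fold_change(mean_a, mean_b):
    """Symmetric fold change (an assumption): larger mean over smaller mean."""
    high, low = max(mean_a, mean_b), min(mean_a, mean_b)
    return high / low


def passes_cutoffs(peak):
    """True if the peak meets both the p-value and the fold-change cutoff."""
    fc = fold_change(peak["mean_exposed"], peak["mean_control"])
    return peak["p"] < P_CUTOFF and fc > FOLD_CUTOFF


# Keep the m/z values of peaks that pass both cutoffs.
candidates = [peak["mz"] for peak in peaks if passes_cutoffs(peak)]
print(candidates)  # [1000.0, 4000.0]
```

In this toy table the peak at 2000.0 fails on p-value and the peak at 3000.0 fails on fold change, so only the up-regulated peak at 1000.0 and the down-regulated peak at 4000.0 remain as candidates.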

【Abstract】 Objective and Significance: Lung cancer is a malignant tumor arising in the bronchial epithelium; the term generally refers to tumors of the lung parenchyma. In recent years, the global incidence of lung cancer has increased 10-30 times in men and 3-8 times in women. In men, lung cancer is the most common cancer worldwide and has the highest cancer mortality; in women, it is the fourth most common cancer worldwide and has the second highest cancer mortality. Smoking is closely related to lung cancer: about 80% of lung cancers in men and 50% of lung cancers in women are associated with smoking. China, a major producer and consumer of tobacco, has a very large number of smokers, and this number will continue to rise in the next few years. Patients whose lung function is impaired by long-term smoking may also be unable to tolerate surgery. Current lung cancer diagnosis relies mainly on computed tomography (CT) and CT-guided biopsy. However, this method is not suitable for screening because it is costly and time-consuming, and it cannot detect small lung cancer lesions (<2 mm). Biomarkers such as CEA are used to monitor relapse and metastasis, but their sensitivity and specificity are limited. There is currently no easy-to-use method for lung cancer screening with high sensitivity and specificity, especially in smokers (the high-risk population). The purpose of this study is to comprehensively study the relationship between smoking and lung cancer using systems biology. Proteomic research on smoking-exposure rat models was performed to reveal protein changes in serum and to evaluate the overall effect of smoking. Proteins and metabolites that serve as a bridge between smoking and lung cancer were sought by comparing the protein and metabolic profiles of lung cancer patients and healthy controls with or without a smoking history. Bioinformatics was applied to find serum and urine biomarkers that could serve as a screening tool with high sensitivity and specificity for lung cancer in smokers and non-smokers, and to establish the corresponding predictive models.

Materials and Methods: In this study, short-term and long-term smoking-exposure rat models were established. For each model, 60 rats were randomly assigned to a control group, a low-toxicity group and a high-toxicity group. Rats were exposed to one cigarette per rat per day, for 3 days in the short-term exposure model and for 90 days in the long-term exposure model. Rat serum samples were processed on CM10 chips and examined using SELDI-TOF-MS, and the raw mass spectrometry data were analyzed with ZJU-PDAS. Protein matches for selected protein peaks were found in the Swiss-Prot database using the TagIdent tool. In addition, 59 serum samples from patients with lung cancer (39 with a previous history of smoking, 20 without) and 103 serum samples from healthy controls (45 with a previous history of smoking, 58 without) were collected. 49 urine samples from lung cancer patients (31 with a history of smoking) and 79 urine samples from healthy controls (36 with a history of smoking) were collected. Serum and urine samples came from the same population; some urine samples were excluded because of an excessively high creatinine value. Serum samples were processed with weak cation exchange (WCX) magnetic beads and then analyzed by MALDI-TOF-MS, while urine samples were analyzed by GC/MS. ZJU-PDAS was used to analyze both the proteomic and the metabolomic data.

Results: In the rat smoking-exposure models, a total of 189 protein peaks were identified in the short-term exposure model and 225 in the long-term exposure model. A p-value below 0.05 and a fold change above 1.2 were used as cutoff values. Together with peak shape and clustering among samples, these criteria yielded 8 differentially expressed protein peaks. One protein peak with a mass-to-charge ratio (m/z) of 3151 was observed in the short-term exposure model, and the other seven peaks were observed in the long-term exposure model, with m/z values of 3502, 3546, 5653, 5854, 7233, 7419 and 8005. Possible protein matches were found in the Swiss-Prot database using the TagIdent tool. Pituitary adenylate cyclase-activating polypeptide (PACAP) is a match for peak 3151 (highly expressed in the low-toxicity and high-toxicity groups), while metastasis-suppressor KISS-1 is a match for peak 5653 (expressed at low levels in the low-toxicity and high-toxicity groups). Previous studies showed that elevated PACAP is related to ischemic heart disease and non-small cell lung cancer, and that decreased KISS-1 levels can be observed in malignancies with high metastatic potential.

In the serum proteomics investigation of lung cancer, the serum protein profiles of lung cancer patients and healthy controls with a smoking history were obtained. 8 differentially expressed protein peaks were found with satisfactory p-value, fold change, peak shape and clustering: 2485, 4790, 2892, 6010, 4878, 1324, 3045 and 4809. These 8 protein peaks can be regarded as potential biomarkers for lung cancer screening in smokers. Similarly, 6 protein peaks were found as potential biomarkers for lung cancer screening in non-smokers: 2485, 2706, 4790, 3555, 2513 and 2549. By comparing the differentially expressed protein peaks in smokers and non-smokers, we found two protein peaks, 3045 and 6010, that were differentially expressed only between lung cancer patients and healthy controls with a smoking history. These two peaks may be associated with lung cancer caused by smoking. Two protein-based predictive models with satisfactory accuracy were also established to screen for lung cancer in smokers and in non-smokers.

In the urinary metabolomics investigation of lung cancer, 20 major endogenous metabolites common to the urine samples of lung cancer patients and healthy controls were found: alanine, oxalic acid (ethanedioic acid), phosphoric acid, glycine, succinic acid, uracil, serine, threonine, 5-oxoproline, ribose, cis-aconitic acid, citric acid, glucose, galactose, tyrosine, palmitic acid (hexadecanoic acid), inositol, uric acid, stearic acid (octadecanoic acid) and pseudouridine. Two predictive models based on urinary metabolites were established to screen for lung cancer in smokers and in non-smokers using a genetic algorithm and an SVM (both included in ZJU-PDAS). The accuracy of the former model was 89.22% for lung cancer and 86.11% for healthy controls; the accuracy of the latter model was 73.33% for lung cancer and 95.35% for healthy controls. We also identified the metabolites that were differentially expressed in the urine of lung cancer patients among smokers and among non-smokers. In smokers, ribose, glucose, oxalic acid, phosphoric acid, galactose, inositol, uric acid, citric acid and cis-aconitic acid were differentially expressed. In non-smokers, inositol, ribose, galactose, glucose, uracil and citric acid were differentially expressed. These small-molecule metabolites could be used as potential urine biomarkers for lung cancer screening in smokers and non-smokers. By further comparing the differentially expressed urinary metabolites of smokers and non-smokers, we found four metabolites that were differentially expressed only in the smoker comparison: oxalic acid, phosphoric acid, uric acid and cis-aconitic acid. These four metabolites are probably involved in smoking-related lung cancer.

Conclusions and Innovations: In this study, we applied proteomics to explore the changes in serum proteins caused by cigarette smoke exposure and comprehensively evaluated the impact of cigarette smoking. Proteomics and metabolomics were applied together to study the relationship between lung cancer and smoking systematically and comprehensively. We found 8 smoking-related protein peaks in the smoking-exposure rat models. Among these, two protein peaks, 3151 (highly expressed) and 5653 (expressed at low levels), were matched in the database to pituitary adenylate cyclase-activating polypeptide (PACAP) and metastasis-suppressor KISS-1, respectively. Previous studies showed that elevated PACAP is related to ischemic heart disease and non-small cell lung cancer, and that decreased KISS-1 levels can be observed in malignancies with high metastatic potential. In the proteomics research on lung cancer, 8 and 6 protein peaks were found for lung cancer screening in smokers and non-smokers, respectively. Moreover, we found that protein peaks 3045 and 6010 may be associated with lung cancer caused by smoking. In the metabolomics research on lung cancer, 9 and 6 urinary metabolites were found to be potential biomarkers for lung cancer screening in smokers and non-smokers, respectively. By further analyzing the data, we found that four metabolites, oxalic acid, phosphoric acid, uric acid and cis-aconitic acid, may be associated with lung cancer caused by smoking. These proteins and small-molecule metabolites provide molecular-level evidence for the association between smoking and lung cancer, and a basis for studying the mechanism of lung cancer caused by smoking. Moreover, predictive models with sufficient sensitivity and specificity were established based on proteins and metabolites to screen for lung cancer in smokers and non-smokers, respectively. Our study provides new insight into lung cancer diagnosis, screening and monitoring. In particular, it offers a more focused and specialized predictive model to screen for lung cancer in smokers.
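The comparison of the two urinary marker lists can be reproduced with plain set operations. The Python sketch below uses the metabolite lists as reported in the abstract (with oxalic acid, phosphoric acid and cis-aconitic acid as the names for ethanedioic acid, phosphate and aconitic acid). Whether the authors derived the four smoker-specific metabolites by a simple set difference is an assumption; the variable names are illustrative.

```python
# Differentially expressed urinary metabolites as listed in the abstract.
smoker_markers = {
    "ribose", "glucose", "oxalic acid", "phosphoric acid", "galactose",
    "inositol", "uric acid", "citric acid", "cis-aconitic acid",
}
nonsmoker_markers = {
    "inositol", "ribose", "galactose", "glucose", "uracil", "citric acid",
}

# Metabolites that differ only in the smoker comparison (set difference).
smoker_only = sorted(smoker_markers - nonsmoker_markers)
# Metabolites that differ in both comparisons (set intersection).
shared = sorted(smoker_markers & nonsmoker_markers)
# Metabolites that differ only in the non-smoker comparison.
nonsmoker_only = sorted(nonsmoker_markers - smoker_markers)

print(smoker_only)
# ['cis-aconitic acid', 'oxalic acid', 'phosphoric acid', 'uric acid']
print(shared)
# ['citric acid', 'galactose', 'glucose', 'inositol', 'ribose']
print(nonsmoker_only)
# ['uracil']
```

The set difference returns exactly the four metabolites the abstract singles out as possibly related to smoking-induced lung cancer, and the list sizes (9 and 6) match the counts given in the conclusions.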

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year and issue】 2014, Issue 03