
Meta分析及生物统计模型在PAHs致人群健康损害危险度评价中的应用研究

Application of Meta-analysis and Biostatistical Models for the Risk Assessment of Population Health Hazard Exposed to PAHs

【Author】 石修权 (Shi Xiuquan)

【Supervisor】 王增珍 (Wang Zengzhen)

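【Author information】 Huazhong University of Science and Technology (华中科技大学), Epidemiology and Health Statistics, 2009, doctoral dissertation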

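【Abstract, translated from the Chinese】 Polycyclic aromatic hydrocarbons (PAHs) are an important class of environmental chemical pollutants (ECPs). Studies show that they harm human health in several ways, for example by damaging DNA and even causing tumors. Under current economic and technical conditions, zero emission of PAHs is not achievable and PAH pollution will persist for a long time, so assessing the risk of PAH exposure correctly matters greatly for choosing prevention and control measures. This study used Meta-analysis to evaluate the existing literature and applied a series of biostatistical models and methods, including neural networks and multifactor dimensionality reduction, to cross-sectional survey data from the occupational population of a steel plant in northern China and to data from a hospital- and community-based case-control study in southern China. It explored the risk of PAH exposure in Chinese occupational and non-occupational populations and the practical value of several models and methods for risk assessment.

Part One: Meta-analysis methodology and its application in PAHs risk assessment

Objective To study Meta-analysis methodology in order to clarify how Meta-regression and subgroup analysis can be used to identify and handle heterogeneity in Meta-analysis data, and how to choose correctly between Egger’s test and Begg’s test when testing for and judging publication bias. Building on this methodological work, a Meta-analysis was carried out on the relationship between lung cancer and polymorphisms of CYP1A1 (cytochrome P450 1A1) and GSTM1 (glutathione S-transferase M1) in Chinese populations, to look for evidence that gene polymorphisms alter susceptibility to lung cancer.

Data and methods 1. Dataset 1: literature data on passive smoking and lung cancer in women were used as an example to build a Meta-regression model and screen for factors influencing heterogeneity; subgroup analysis was then performed on those factors and the change in heterogeneity was observed. 2. Dataset 2: from 28 papers on gastric cancer and interleukin, 11 new samples each were generated by random omission and by selective (non-random) omission of studies. Publication bias was then examined with Egger’s test, Begg’s test and funnel plots, the results were compared, and tests of heterogeneity and of the normality of sample size were carried out as well. 3. Dataset 3: studies from the past 20 years on the relationship between lung cancer and the CYP1A1 MspI and Exon7 polymorphisms and the GSTM1 polymorphism in Chinese populations were collected comprehensively from domestic and international databases, analyzed by Meta-analysis, and compared with results of similar studies in other ethnic groups.

Results 1. Dataset 1: Q=44.71, df=27, P=0.017, indicating heterogeneity. Meta-regression selected sample size and region from the candidate factors as possible sources of heterogeneity (P=0.012 and P=0.091). After subgroup analysis, heterogeneity decreased markedly. 2. Dataset 2: (1) Under random omission, the P values of Egger’s test and Begg’s test were all greater than 0.05. Under selective omission, Begg’s test gave P values below 0.05 in 5 samples while all P values of Egger’s test were greater than 0.05, and the funnel plots were all somewhat asymmetric. (2) Under selective omission, nearly all P values of the heterogeneity and normality tests were below 0.05. 3. (1) For the relationship between lung cancer and the CYP1A1 MspI, CYP1A1 Exon7 and GSTM1 polymorphisms, the pooled effects were, respectively: odds ratio (OR)=1.35, 95% confidence interval (CI)=(1.11, 1.65), Z=3.01, P=0.003; OR=1.55, 95% CI=(1.22, 1.97), Z=3.61, P<0.001; OR=1.63, 95% CI=(1.40, 1.88), Z=6.51, P<0.001. (2) Compared with similar studies, polymorphisms of these two genes in Caucasians showed no significant increase in susceptibility to lung cancer, or a smaller increase than in Chinese populations.

Conclusions 1. Meta-regression is a simple and reliable way to screen for factors influencing heterogeneity, and subgroup analysis based on it clearly reduces within-subgroup heterogeneity. 2. (1) Random omission does not lead to publication bias, whereas selective omission does; in that case Begg’s test detects the true publication bias more readily than Egger’s test. (2) Factors that may affect the power of Egger’s test for publication bias include non-normality and heterogeneity. 3. Polymorphisms of CYP1A1 and GSTM1 can increase susceptibility to lung cancer in Chinese populations, and this effect differs markedly between ethnic groups.

Part Two: Application of biostatistical models in the risk assessment of population health damage caused by PAHs

Objective People exposed to PAH-polluted environments may suffer damage to DNA, chromosomes or other targets, and even carcinogenic effects. Identifying and predicting early health damage in exposed individuals through molecular biomarkers, evaluating the dose-response relationship, and analyzing gene-environment interactions are all important for assessing the health risk of populations exposed to environmental pollution.

Methods Dataset 1 covered 330 employees of a steel plant. Each subject’s level of early health damage was measured with four biomarkers: micronucleus frequency, heat shock protein 70 level, BPDE-albumin adducts and comet tail moment. An artificial neural network (ANN) model was used to predict the level of early health damage and was evaluated with the ROC curve. Multiple correspondence analysis (MCA) was used to examine the correspondence among workplace exposure, length of service and early health damage in coke-oven workers. Dataset 2 covered 500 lung cancer patients and 517 controls, with polymorphism data for 65 loci in 16 genes that may affect PAH metabolism, together with 6 main environmental factors such as living habits and family history. After testing for Hardy-Weinberg equilibrium, multifactor dimensionality reduction (MDR) was used to analyze gene-gene and gene-environment interactions, and logistic regression was used as a supplementary analysis of the form and size of the interaction effects.

Results 1. Coke-oven worker data: (1) With the 95th percentile (P95) of the control group as the cut-off for early health damage, combined screening with multiple biomarkers identified 55 of the 330 workers as positive, whereas screening with any single biomarker identified only 22 to 35. (2) Six easily measured variables, including workplace exposure and blood cholesterol level, were selected to fit the ANN model; the area under the ROC curve (AUROC) was 0.726±0.037 (P<0.001). (3) The MCA plot showed a clear correspondence among workplace exposure, length of service and early health damage in coke-oven workers, and the trend test of the early health damage rate across groups was statistically significant (Z=3.24, P=0.001). 2. Case-control data: (1) Hardy-Weinberg equilibrium tests of the single nucleotide polymorphisms in the control group found all of them to be in genetic equilibrium (P=0.24-0.97). (2) The "high-risk" group, defined by combinations of heterozygous polymorphism at locus rs4646782 of the aldehyde dehydrogenase 2 gene, heterozygous polymorphism at locus rs2158041 of the aryl hydrocarbon receptor gene and longer smoking duration, had 2.637 times the lung cancer risk of the "low-risk" group without those combinations (OR=2.637, 95% CI=2.047-3.403). (3) The gene polymorphisms "rs4646782 * rs2158041" showed a multiplicative effect and "rs4646782 + rs2158041 + smoking year" showed an additive effect; for each one-unit increase in the product of the former and in the sum of the latter, lung cancer risk increased by 0.097 and 0.199 times, respectively.

Conclusions 1. The four biomarkers can be used together to screen for early health damage in coke-oven workers, and combined screening outperforms any single biomarker. The ANN model can be used to predict early health damage, and its predictive performance was confirmed by the AUROC. MCA reveals the link between exposure and effect in coke-oven workers in a direct, visual way and can support the assessment of dose-response relationships. 2. The three factors rs4646782, rs2158041 and smoking year show a significant interaction in lung cancer risk; the "high-risk" group has a risk 1.637 times higher than the "low-risk" group. The results also suggest that MDR is a practical method with good prospects for analyzing gene-gene and gene-environment interactions in multifactorial diseases. Logistic regression can be used as a supplementary analysis of the form and size of interaction effects.

In summary, this study discussed early damage after human exposure to PAHs and the influence of environmental and genetic factors (gene polymorphisms and ethnic differences) on the risk of population health damage. It established a preliminary set of methods and a system for the comprehensive assessment of the health risk of PAH exposure, based on evidence-based Meta-analysis, the selection and combined screening of molecular biomarkers, ANN models, MCA and MDR. The results provide biostatistical methods and application patterns for further research on the macro- and micro-level risks of early damage and even lung cancer caused by PAHs and on the factors that influence them, and lay a methodological foundation for future early-warning systems for PAHs and other ECPs.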
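As a rough consistency check on the heterogeneity result for dataset 1 (Q=44.71, df=27, P=0.017), the sketch below recomputes the chi-square upper-tail probability with only the Python standard library. It also derives Higgins’ I² from Q and df. The abstract does not report I², and the helper name `chi2_sf` is illustrative rather than anything taken from the thesis.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (2r-1)!!
    chi = math.sqrt(x)
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # standard normal density at chi
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return math.erfc(chi / math.sqrt(2)) + 2 * phi * total

# Values reported in the abstract for dataset 1.
q, df = 44.71, 27

p_value = chi2_sf(q, df)
# Higgins' I^2: share of total variation attributed to heterogeneity (not reported in the abstract).
i_squared = max(0.0, (q - df) / q)

print(f"P(chi2_{df} > {q}) = {p_value:.4f}")
print(f"I^2 = {100 * i_squared:.1f}%")
```

The recomputed tail probability can be compared directly with the reported P=0.017. I² is added only as a conventional companion statistic and is not a result of the thesis.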

【Abstract】 Polycyclic aromatic hydrocarbons (PAHs) are an important class of environmental chemical pollutants (ECPs) that have been reported to harm health, causing DNA damage and even carcinogenesis. At present, however, PAH emissions cannot be eliminated entirely, and PAH pollution will persist for a long time. Assessing the risks correctly is therefore an essential and valuable exercise.

In this research, we carried out a Meta-analysis to appraise the relevant published studies and used several biostatistical models and methods, such as the artificial neural network (ANN) model and multifactor dimensionality reduction (MDR), to analyze a cross-sectional study of coke-oven workers in a northern steel plant and a hospital- and community-based case-control study in South China. We explored the risk of health hazards in Chinese occupational and general populations exposed to PAHs, and the value of Meta-analysis and biostatistical models in risk assessment.

Part One: Meta-analysis method and its application in PAHs risk assessment

Objectives (1) To explore the role and application of Meta-regression and subgroup analyses in recognizing and controlling heterogeneity; (2) to explore how to choose correctly between Egger’s test and Begg’s test, and how to interpret them, when detecting publication bias in Meta-analysis; and (3) to look for evidence of an association between the CYP1A1 variant and GSTM1 null genotypes and increased risk of lung cancer in Chinese populations. Genetic polymorphisms of the cytochrome P450 1A1 (CYP1A1) and glutathione S-transferase M1 (GSTM1) genes are thought to have significant effects on the metabolism of environmental carcinogens, but the reported results are not always consistent.

Methods 1. Meta-regression models were built on database 1, drawn from case-control studies of lung cancer and passive smoking in women, to search for sources of heterogeneity, and heterogeneity was compared before and after subgroup analyses. 2. Database 2 comprised 28 papers on interleukin polymorphisms and gastric cancer risk, from which 11 datasets each were generated by random omission and by selective (tendency) omission of studies. Egger’s test, Begg’s test, funnel plots and other methods were used to detect publication bias, and the results were compared. Tests of heterogeneity and of normality were also performed. 3. Through a systematic literature search for publications between 1989 and 2008, we summarized data from 54 studies on the MspI and Exon7 polymorphisms of CYP1A1, the GSTM1 polymorphism, and lung cancer risk in Chinese populations. Our results were also compared with those for other ethnic populations.

Results 1. Heterogeneity was present in database 1 (Q=44.71, P=0.017). Meta-regression selected sample size and region as possible sources (P=0.012 and P=0.091, respectively). The Q values were lower after subgroup analyses. 2. For database 2: (1) Under random omission, the P-values of Egger’s test and Begg’s test were all greater than 0.05. Under selective omission, the P-values of Egger’s test were all greater than 0.05, whereas Begg’s test gave P-values below 0.05 in 5 datasets, and the funnel plots appeared asymmetrical, suggesting potential publication bias. (2) Under selective omission, nearly all heterogeneity and normality tests were significant. 3. (1) For CYP1A1 MspI, the lung cancer risk for types B and C was 1.35-fold that for type A (95% confidence interval [CI]=1.11-1.65; Z=3.01, P=0.003). (2) The risk for the Ile/Val and Val/Val genotypes of CYP1A1 Exon7 was 1.55-fold that for the Ile/Ile genotype (95% CI=1.22-1.97; Z=3.61, P<0.001). (3) The risk for the GSTM1 null genotype was 1.63-fold that for the present genotype (95% CI=1.40-1.88; Z=6.51, P<0.001). (4) Compared with other ethnic groups, susceptibility to lung cancer in Caucasians was either not clearly increased or increased much less than in Chinese populations.

Conclusions 1. Meta-regression is a convenient and reliable way to search for factors affecting heterogeneity, and subgroup analyses based on it can significantly lower heterogeneity. 2. Selective omission can lead to publication bias, whereas random omission does not. Under selective omission, Begg’s test detects publication bias more readily than Egger’s test. A non-normal distribution of sample size and significant heterogeneity are possible influences on the power of Egger’s test. 3. We found evidence of an association between the CYP1A1 variant and GSTM1 null genotypes and increased risk of lung cancer in Chinese populations. Moreover, the relationship between CYP1A1 and GSTM1 polymorphisms and lung cancer risk differs significantly between ethnic groups.

Part Two: Application of biostatistical models in PAHs risk assessments

Objective Exposure to PAH pollutants can cause health damage and lead to carcinogenesis. It is therefore critical in molecular epidemiological studies to identify biomarkers that predict early health damage in exposed individuals. It is also valuable to explore the dose-response relationship and gene-gene and gene-environment interaction effects in the risk assessment of PAH exposure.

Methods Database 1 included 330 steel-factory workers exposed to different levels of PAHs in the workplace. Their levels of early health damage were determined by micronuclei (MN) rate, heat shock protein 70 (Hsp70) level, benzo[a]pyrene diol epoxide-albumin adduct (BPDE-AA) and Olive tail moment (Olive TM). An ANN model was fitted to predict the health damage index, and the receiver operating characteristic (ROC) curve was used to evaluate the judgment criteria and the ANN model. Multiple correspondence analysis (MCA) and a trend chi-square test were used to analyze the possible dose-response relationship among workplace, service years and the degree of early health damage in these coke-oven workers. Database 2 covered 65 single nucleotide polymorphism (SNP) positions in 16 genes that may affect PAH metabolism, collected from a case-control study of 500 lung cancer patients and 517 controls, together with 6 main environmental factors. After tests of Hardy-Weinberg equilibrium, gene-gene and gene-environment interaction models were fitted with the MDR software, and logistic regression was carried out as a supplement to MDR to examine the form and effect size of the interactions.

Results Coke-oven worker data: (1) Under the multi-biomarker criterion with the 95th percentile as the cut-off value, 55 of the 330 workers had early health damage, whereas screening by any single biomarker identified 22-35 positive subjects. (2) Six easily measured variables, such as workplace and cholesterol, were selected to fit the ANN model. The area under the ROC curve (AUROC) was 0.726±0.037 (P<0.001). (3) Clear correspondence was found among workplace exposure, service years and the degree of early health damage in coke-oven workers. Moreover, the trend test was statistically significant (Z=3.24, P=0.001).

Case-control study data: (1) Hardy-Weinberg equilibrium tests of all SNPs in the control group indicated that the population was in Hardy-Weinberg equilibrium (P=0.24-0.97). (2) Combinations of the heterozygous polymorphism at position rs4646782 of the aldehyde dehydrogenase 2 (ALDH2) gene, the heterozygous polymorphism at position rs2158041 of the aryl hydrocarbon receptor (AhR) gene, and "long years of smoking" were classified as the "high-risk" population, and all other combinations as the "low-risk" population. The risk of lung cancer in the "high-risk" population was 2.637 times that in the "low-risk" population (OR=2.637, 95% CI=2.047-3.403). (3) The combination "rs4646782 * rs2158041" of gene polymorphisms had a multiplicative effect, and "rs4646782 + rs2158041 + smoking year" had an additive effect; each one-unit increase in the product term or in the additive term raised lung cancer risk by 0.097 or 0.199 times, respectively.

Conclusions 1. (1) MN frequency, Hsp70, BPDE-AA level and Olive TM can be used together to screen for early health damage in coke-oven workers, and the multi-biomarker approach performs better than any single biomarker. (2) The ANN model can be used to predict the degree of early health damage. Its performance was confirmed by the AUROC, which suggests that the multi-biomarker criterion and the cut-off value were appropriate. (3) MCA can reveal the relationships between PAH exposure and effect visually, so it can be applied as an auxiliary method for judging whether a dose-response relationship exists.

2. The three factors rs4646782 polymorphism, rs2158041 polymorphism and smoking year interact markedly in lung cancer risk in the southern Chinese population, and the "high-risk" combination may raise lung cancer risk by 1.637 times relative to the "low-risk" combination. Moreover, our findings suggest that using the MDR method to analyze gene-gene and gene-environment interactions in multifactorial diseases is practical and feasible and has good prospects for application. Logistic regression may be used as a supplementary method after MDR to analyze the form and effect size of the interactions.

In summary, our study discussed early health damage after human exposure to PAHs and the influence of environmental and hereditary factors (gene polymorphisms and ethnic differences) on population health risk. We established a preliminary system and set of methods for evaluating the health hazard of environmental pollutants in Chinese populations, based on Meta-analysis, multiple biomarkers, ANN, MCA and MDR models.

Our findings also provide biostatistical methods and application patterns for further research on the macro- and micro-level risk and damage mechanisms of early health damage, and even lung cancer, caused by PAH exposure. The results might offer a methodological foundation for establishing an early warning system for PAHs and other ECPs.
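The pooled odds ratios, confidence intervals and Z statistics reported for Part One can be cross-checked against one another. The sketch below assumes the usual large-sample relationship on the log-odds scale (95% CI = exp(ln OR ± 1.96·SE), Z = ln OR / SE). That relationship is an assumption about how the figures were derived, not something the abstract states, and the function names are illustrative.

```python
import math

Z_CRIT = 1.959964  # two-sided 95% quantile of the standard normal distribution

def z_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Recover the Z statistic from an OR and its 95% CI, assuming a log-scale Wald interval."""
    se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_CRIT)
    return math.log(odds_ratio) / se_log_or

def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# (label, OR, CI lower, CI upper, reported Z) as given in the abstract.
REPORTED = [
    ("CYP1A1 MspI", 1.35, 1.11, 1.65, 3.01),
    ("CYP1A1 Exon7", 1.55, 1.22, 1.97, 3.61),
    ("GSTM1 null", 1.63, 1.40, 1.88, 6.51),
]

recomputed = {}
for label, odds_ratio, lo, hi, z_reported in REPORTED:
    z = z_from_or_ci(odds_ratio, lo, hi)
    recomputed[label] = z
    print(f"{label}: recomputed Z = {z:.2f}, reported Z = {z_reported}, "
          f"two-sided P = {two_sided_p(z):.4f}")
```

Small differences between recomputed and reported Z values are expected, because the published ORs and interval limits are rounded to two decimals.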
