A Discussion of Preliminary Data Preparation Methods for Data Mining of Ancient Prescriptions

【Author】 Chen Jin (陈金)

【Advisor】 Liang Maoxin (梁茂新)

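【Author information】 Liaoning University of Traditional Chinese Medicine, Science of Prescriptions (方剂学), 2010, doctoral dissertation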

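【Abstract (translated from the Chinese)】 Purpose and content: By analyzing how prescription-related elements are composed in the ancient Chinese medical literature, and how the elements within a single prescription relate to one another, this dissertation explores solutions to the problems that arise during preliminary data preparation for data mining of ancient prescriptions. The work proceeds from six angles.

First, it reviews the historical background of data mining, its general process and related disciplines, the characteristics of the data it uses, and its methodological principles, and on that basis positions the study on the problems of preliminary data preparation for ancient prescription data mining.

Second, it reviews the methods used to study Chinese medicine and analyzes the methodological characteristics of data mining in that field. Information in the ancient Chinese medical literature is hard to study with conventional statistical methods, while philological methods are inevitably limited by manual capacity. Data mining therefore has important value for research on this literature.

Third, it reviews the areas of Chinese medicine in which data mining has been applied in recent years, together with the data preparation work done in earlier ancient prescription mining studies. To discover new knowledge in traditional Chinese medicine through data mining, the relevant Chinese medicine theory should be built into the mining process. Many problems in data preparation for ancient prescription mining still await solution.

Fourth, it states that the purpose of ancient prescription mining is to discover, from as much of the ancient prescription literature as possible, knowledge that ancient physicians did not state explicitly: relationships between drugs and diseases, relationships between drugs, dose-effect relationships of drugs, and so on. The task has several special features: prescription information is hard to unify and standardize; the relationships among the elements of a single prescription are complex; disease and drug information must be supplemented from other literature or by other methods; and the resulting data are not the everyday transactional data that data mining usually handles. Given this purpose and these features, preliminary data preparation must pay particular attention to incorporating Chinese medicine knowledge and traditional methods such as philology.

Fifth, it analyzes the problems that arise while forming the data and puts forward the following views, conclusions, or results:
1. Ancient prescription texts differ by period and by type. They should be selected and used according to the specific case, and the database should be designed around the content of the texts so that it can hold every kind of prescription-related information.
2. The minimum condition for forming a prescription record is a disease and drugs that correspond to each other. Any change in a constituent element such as disease condition, drug, dose, processing (paozhi), or dosage form may require a new prescription record.
3. Ancient prescription texts describe disease-related information in several ways. The database design should accommodate information in each of these forms.
4. Entering disease information requires segmenting the disease descriptions in the ancient texts to isolate individual disease units, which are then standardized in an appropriate way. Traditional methods such as textual research should be valued in this process. Attention is also needed to the weight of each piece of disease information within the disease picture a prescription addresses, and to how treatment-method information is handled.
5. Drug names should be standardized after the database has been formed. On the basis of a standard drug name table, each drug is identified through traditional textual research and then relabeled.
6. Where a prescription is used as a drug within another prescription, the more reasonable treatment is to split it into its drugs and add those drugs to the host prescription.
7. Drug information from the materia medica (bencao) literature should be stored in a separate database after drug names have been standardized. The standard drug names shared by the prescription database and the drug database link the two seamlessly. The drug database should store each drug's indications (functions), nature, flavor, toxicity, channel tropism, and relationships with other drugs, and should record the source of each item.
8. The processing information attached to each drug in an ancient prescription should be preserved, and it too needs standardization.
9. Some dosage forms in ancient prescriptions differ from their usual modern meaning, for example "powders" (san) made by concentration, dan preparations formed by chemical reaction, and "wines" made by brewing. Such dosage forms need a special labeling method.
10. Special materials need to be marked explicitly in the database: the liquid used for decocting or for taking pills and powders, the guiding ingredient (yaoyin) of a decoction, the binder of a pill, the base of a paste, and so on.
11. In actual use, the dosage form of an individual drug may differ from that of the whole prescription, so the actual dosage form of each drug should be stored.
12. To simplify data processing, the changes in the weights and measures used in ancient Chinese medicine are divided into three stages: the Han-Tang period, the Song-Jin period, and the Yuan-Ming-Qing period.
13. Separate tables give the conversion between the units of each period and international standard units, for reference during data processing.
14. After analyzing the meanings of qian, qianbi, zi, and zibi, the dissertation uses, for the first time, clues provided in the Bencao Jing Jizhu to fix the volume of one fangcunbi in Tao Hongjing's time at 5 mL, working from three directions: the volume of natural objects, historical standard units, and the "medicinal sheng" (药升) method. From this value it determines the volumes of the natural objects mentioned in the texts and calculates the volumes of scooping units such as the qianbi and qianwubi.
15. It analyzes how the meaning and actual value of the weight unit fen changed across periods in the ancient prescription literature, and shows for the first time that the medical fen changed from the "large-scale fen of the zhu system" (铢制大称分) to the "fen of the qian system" (钱制分) in the Yuan dynasty, not in the Song dynasty as stated in the History of Song.
16. Converting the non-weight units in ancient prescription texts into weight units is an unavoidable requirement of data mining research. At present, however, the quality and specifications of the ancient drugs are hard to determine, so the conversion should not be forced. "Equal parts" (等分) can be handled according to Tao Hongjing's explanation.
17. Ancient prescription mining requires dose data in one uniform standard. That standard can only be the daily dose or the per-administration dose, derived from the original text and expressed in grams.
18. Building on the analysis of the problems above, the database tables and their entity-relationship (ER) diagram for ancient prescription data mining are produced.

Sixth, it analyzes the relationships among the elements within a single prescription. On that basis it discusses how to establish that a drug directly treats a given condition, the dose-effect relationship, the discovery of habitual drug combinations, and the sovereign, minister, assistant, and envoy (君臣佐使) roles within a particular prescription.

Conclusions:
1. Ancient prescription data mining is the process of selecting and integrating prescription data from the ancient Chinese medical literature on the basis of Chinese medicine knowledge, applying data mining technology, and mining new knowledge from those data and expressing it in the language of Chinese medicine.
2. Ancient prescription data have the following characteristics: a. They come from diverse sources. b. They are representative data. c. Some attributes are frequently missing. d. The terms they use are generally not standardized. e. Complex and varied relationships exist within and between the attributes of a prescription.
3. Processing of ancient prescription data should follow these principles: a. Ensure that the original text can be located from each data record. b. Consider creating a new prescription record whenever any attribute of a prescription changes. c. Apply uniform standardization to attributes that contain non-standard terms. d. Make full use of established Chinese medicine knowledge and value the participation of related disciplines.
4. Certain attribute values of prescriptions and drugs should be uniformly standardized. Disease names, pattern names, pathomechanism names, symptom names, treatment-method names, and drug indications or functions, all of which concern the disease condition, should be handled together. Drug name standardization should be based mainly on the names that occur in the prescription table, and the same standard names should be used in every drug table. Further standard name tables can cover processing methods, special uses of drugs, bibliographic information, and so on. The steps for building each standard name table are broadly the same: a. Collect all relevant terms. b. Analyze the senses of every term. c. Merge identical senses and fix a standard name. d. Assign standard names according to the relationship among the original term, its sense, and the standard name.
5. On dose handling: the standardized form of an ancient prescription dose is the daily dose and the per-administration dose of a drug, expressed in grams, given a determined dosage form, preparation method, and method of use. The dissertation provides conversion tables between ancient Chinese medical units and modern international standard units as a reference for converting original doses. It fixes, for the first time and by several calculation methods, the volume of one fangcunbi in the period of the Bencao Jing Jizhu at 5 mL, and from this systematically derives reasonable ranges for scooping units such as the qianbi, zibi, and fangcunbi in different periods. It also establishes, for the first time, how to convert the medical weight unit fen of each period into modern international standard units. The conditions for converting all non-weight units in ancient prescriptions into weight units are not yet mature; for now, such dose annotations can only be stored as the original text.
6. Several kinds of relationship are possible between conditions, between drugs, and between conditions and drugs. A drug can be taken to treat a condition directly only after the other possible relationships have been ruled out. Data analysis techniques can serve this purpose, but the most direct and effective route is to use single-drug prescriptions, records of drugs added or removed according to symptoms, and the indications supplied by the drug database. In ancient prescription mining, determining dose-effect relationships, identifying habitual therapeutic drug combinations, and finding patterns for assigning the sovereign, minister, assistant, and envoy roles all depend on reasonably prepared data and on clarifying the main relationships described above.
7. Accurate and reasonable preliminary data preparation is a necessary part of ancient prescription mining. It requires systematic standardization that stays as faithful as possible to the original meaning of the texts, and it must also take into account the relationships among the elements of a prescription. Some common issues deserve attention in this process, such as the use of Chinese medicine theories or conclusions, the standardization of the various attributes, and careful analysis and handling of details. Solving the problems of data preparation also calls for the combined use of other traditional research methods, such as traditional Chinese medicine thinking, textual research, examination of cultural relics, logical methods, mathematics, and measurement and experiment. Rigorous handling of details should be combined with the specific purpose of the mining task, so that the data are accurate and reliable while the work remains practicable. The content discussed here is relatively fine-grained; in a concrete mining study it can be adopted selectively according to actual needs.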
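The four steps that conclusion 4 gives for building a standard name table (collect terms, analyze senses, merge senses and fix a standard name, label records) can be sketched as follows. This is only an illustrative outline, not the dissertation's implementation: the records, the romanized drug names, the sense labels, and the rule for choosing a standard name are all assumptions made for the example, and in practice steps (b) and (c) depend on manual textual research.

```python
from collections import defaultdict

# Illustrative records only; the field names and drug terms are assumptions.
records = [
    {"formula_id": 1, "drug_as_written": "gancao"},
    {"formula_id": 2, "drug_as_written": "guolao"},
    {"formula_id": 3, "drug_as_written": "dahuang"},
    {"formula_id": 4, "drug_as_written": "jiangjun"},
]

# (a) Collect every term exactly as it is written in the source records.
terms = sorted({r["drug_as_written"] for r in records})

# (b) Sense of each term. Hypothetical lookup; in practice this is the
#     result of manual textual research, not something code can decide.
sense_of = {
    "gancao": "licorice root",
    "guolao": "licorice root",
    "dahuang": "rhubarb root",
    "jiangjun": "rhubarb root",
}

# (c) Merge terms that share a sense and fix one standard name per sense.
terms_by_sense = defaultdict(list)
for term in terms:
    terms_by_sense[sense_of[term]].append(term)
# Assumption: the alphabetically first term stands in for an expert's choice.
standard_name = {sense: min(ts) for sense, ts in terms_by_sense.items()}

# (d) Label each record with its standard name, keeping the original wording
#     so the source text can still be traced from the record.
def label(record):
    std = standard_name[sense_of[record["drug_as_written"]]]
    return {**record, "standard_name": std}

labeled = [label(r) for r in records]
```

Keeping `drug_as_written` next to `standard_name` reflects principle 3a above: the original wording stays recoverable from every record.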

【Abstract】 Purposes and content: To explore solutions to the problems of preliminary data preparation for data mining of ancient prescriptions, this paper analyzes the composition of prescription elements in the ancient Chinese medical literature and the relationships among the elements within one prescription. The paper proceeds from the following points of view:

Firstly, after reviewing the background of data mining, its main process, the related disciplines, the characteristics of the data used, and the theory of data mining, the paper targets the research at the methods of preliminary data preparation for data mining of ancient prescriptions.

Secondly, by retracing the methods used in research on Chinese medicine, the paper identifies the methodological features of data mining. It is hard to study the information in the ancient Chinese medical literature by statistical methods, and documentary methods are necessarily limited by manual capacity. Data mining is therefore highly valuable in the study of the ancient Chinese medical literature.

Thirdly, by reviewing both the areas of Chinese medicine in which data mining has been used and the data preparation done for ancient prescription data mining, the paper argues that the theory of Chinese medicine should be built into the data mining process in order to find "new" knowledge in Traditional Chinese Medicine. Many problems in the data preparation stage still need solutions.

Fourthly, the purpose of ancient prescription data mining is to find the relationships between drugs and diseases, the relationships between drugs, dose-effect relationships, and so on, that the ancient physicians never pointed out clearly. The mining process involves special situations: the data are not easy to standardize; the relationships among the components of one prescription are complex; the data on diseases and drugs have to be completed from other sources; and ancient prescription data are not the daily transaction data that data mining usually handles. Because of this purpose and these special situations, Chinese medicine knowledge and documentary methods should be added to the preliminary data preparation.

Fifthly, considering the problems in the preliminary data preparation, the paper lists the following views, conclusions, or results:
1. Ancient prescription documents differ by type and period, which requires us to choose and use them accordingly. To hold the diverse information about prescriptions in these documents, the database should be designed on the basis of their contents.
2. The minimum conditions for forming a prescription record are a disease, drugs, and the correspondence between them. Any change in disease, drugs, dose, drug preparation, dosage form, and so on may make it necessary to derive a new record.
3. Since different documents describe diseases in diverse ways, the database should be designed to keep the different kinds of information.
4. The disease descriptions in ancient documents should be divided to separate out disease units, and the disease units should be standardized in an appropriate manner. Traditional documentary methods should be valued during this process, and the weight of each disease unit within the whole disease, as well as the treatment information, should be properly handled.
5. The standardization of drug names must be carried out after the database has been built. On the basis of a standard drug name list, each drug is identified by traditional textual research and its name is marked anew.
6. If a prescription is used as a drug in another prescription, the first prescription should be divided into its drugs, and these drugs should join the latter prescription.
7. Information from the herbal literature should be saved in a drug database after drug name standardization. The prescription database and the drug database can be seamlessly joined through the shared standard drug names. The drug database should save each drug's indications, nature (cold or hot), tastes, toxicity, channel tropism, and relationships with other drugs, and the source of the information should be traceable.
8. The information on drug preparation should be saved and standardized too.
9. Some dosage form names in ancient prescriptions have meanings different from today's, for example the "powder" made by concentrating, the Dan made by chemical reaction, and the "Wine Agent" brewed from drugs. These dosage forms should be marked in a special way.
10. The liquid for decoction or for taking pills and powders, the Yaoyin (guiding ingredient) of a decoction, the binder of a pill, the base of an ointment, and so on are special materials. Their attributes should be marked out.
11. In practice, a drug may be in a form different from the dosage form of its prescription, so the dosage forms of drugs should be saved independently.
12. To facilitate data processing, the paper divides the development of the metrology system used in Chinese medicine into three phases: the Han and Tang period, the Song and Jin period, and the Yuan, Ming and Qing period.
13. To explain the conversion between the historical units of the metrology system used in Chinese medicine and the present international system, the paper provides reference tables for data processing.
14. After analyzing the meanings of Qian, Qianbi, Zi, and Zibi, the paper ascertains for the first time, by different methods, that one Fangcunbi in the time of Tao Hongjing equals 5 ml. The volumes of some natural objects and of the Qianbi, Qianwubi, etc. are worked out as well.
15. After analyzing the changes in meaning and value of Fen, a unit of weight in ancient prescription documents, the paper proves that the change in the meaning of the medical weight unit Fen from "big Fen in the Zhu system" to "Fen in the Qian system" happened in the Yuan period, not in the Song period as the History of Song states.
16. In ancient prescription data mining, turning all non-weight units into weight units is an inevitable demand. But it is not proper to do so while the quality and standards of the ancient drugs are hard to define. "Dengfen" can be treated according to Tao Hongjing's explanation.
17. For ancient prescription data mining, different drug doses should be in one standard form. The form should be the daily dose or the dose per administration, worked out from the original document and expressed in grams.
18. Based on the analysis of the problems above, the database tables and the Entity Relationship Diagram for ancient prescription data mining are designed.

Sixthly, the relationships among the elements inside one prescription record are studied. On this basis, a series of problems is investigated, including determining a drug's direct treatment function for a certain disease, the dose-effect relationship, detecting customary drug combinations, and identifying the principal, the assistant, the complement, and the guide in one prescription.

Conclusion:
1. Ancient prescription data mining means the whole process of relying on Chinese medicine knowledge, using data mining technology, collecting and integrating prescription data from ancient Chinese medicine documents, mining the data for new Chinese medicine knowledge, and expressing the new knowledge in the language of Chinese medicine.
2. Ancient prescription data have the following characteristics: a. They come from diverse sources. b. They are representative data. c. Some of their attribute data are commonly missing. d. The original words they use are commonly not standard. e. There are complex relationships among data within the same attribute and across different attributes.
3. Ancient prescription data processing should follow these principles: a. Ensure that the original words in the documents can be found from the corresponding data records. b. Whenever any attribute of a prescription record changes, consider building a new record. c. Uniformly standardize the non-standard terms used in the related attributes. d. Make full use of known Chinese medicine knowledge, and value the involvement of relevant disciplines.
4. The terms of some attributes should be uniformly standardized. Disease terms, symptom terms, pathogenesis terms, treatment terms, and drug indication terms should be treated together. Drug names should be standardized mainly from the prescriptions, and unified drug names should be used in the different tables. Standardization should also be carried out for the terms of drug preparation, special uses of drugs, document information, and so on. The common steps are: a. Collect all the relevant terms. b. Analyze the meaning items of all the terms. c. Merge identical meaning items and assign the standard name. d. Mark the standard name by studying the original term, the meaning item, and the standard name.
5. Ancient prescription dose data processing: the standard form of an ancient prescription dose should be the daily dose and the dose per administration expressed in grams, under a given dosage form, preparation, and method of use. The paper provides tables of conversion between the units of the metrology system used in Chinese medicine and the international system for data processing. For the first time, the paper ascertains by different methods that one Fangcunbi in the time of the Bencao Jing Jizhu equals 5 ml, and accordingly obtains systematically the volumes of the Qianbi, Zibi, Fangcunbi, etc. in different periods. For the first time, the paper ascertains the values in grams of the weight unit Fen in different periods. The paper believes that it is not proper at present to turn all non-weight units into weight units; for now, the only way to process doses in non-weight units is to keep the original words.
6. The paper believes that several kinds of relationship are possible between diseases, between drugs, and between diseases and drugs in one prescription record. The direct treatment function of one drug for one disease can be supported only after the other kinds of relationship have been eliminated. Besides data mining, the function can be affirmed through single-drug prescription data, drug additions or subtractions related to a certain symptom, and the drug treatment function data. The paper also believes that, in ancient prescription data mining, determining the dose-effect relationship, detecting customary drug combinations, and discovering the patterns for identifying the principal, the assistant, the complement, and the guide all depend on preparing the data reasonably and ascertaining the key relationships above.
7. As a necessary link in ancient prescription data mining, the preliminary data preparation should be accurate and reasonable. The work needs not only systematic standardization that follows the original documents as closely as possible, but also consideration of the relationships between prescription elements. In this process some common problems need attention, such as the use of Chinese medicine theories or conclusions, the standardization of the various attributes, and careful analysis and processing of details. It is also important to draw on other traditional methods, such as the traditional Chinese medicine way of thinking, textual research, the study of cultural relics, logical methods, mathematical methods, measurement, and experiment. For the preliminary data preparation of ancient prescription data mining, rigorous treatment of details must be combined with the specific mining purpose, and both the accuracy and reliability of the data and practical operability should be considered. Because this is a relatively detailed study, some aspects can be adopted or omitted as a specific mining task requires.
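The dose-handling rule in conclusion 5 (grams where a weight conversion is available, original wording kept for non-weight units) might look like the sketch below. The only figure taken from the abstract is 5 ml per Fangcunbi; the `Dose` class, the `normalize_dose` function, and the gram factor used in the usage lines are hypothetical, and the dummy factor is deliberately not a historical value. Real factors would come from the period-specific conversion tables the paper describes.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# From the abstract: one Fangcunbi in Tao Hongjing's time is taken as 5 ml.
FANGCUNBI_ML = 5.0

@dataclass
class Dose:
    original_text: str                  # always kept, so the source wording survives
    grams: Optional[float] = None       # set only for weight units
    millilitres: Optional[float] = None # set only for the Fangcunbi volume unit

def normalize_dose(amount: float, unit: str,
                   grams_per_unit: Dict[str, float]) -> Dose:
    """Convert a dose where possible; otherwise keep only the original text.

    grams_per_unit is a caller-supplied table for ONE historical period
    (hypothetical here; the abstract does not list the actual factors).
    """
    original = f"{amount} {unit}"
    if unit in grams_per_unit:
        return Dose(original, grams=amount * grams_per_unit[unit])
    if unit == "fangcunbi":
        # A volume, not a weight: no gram value is derived from it.
        return Dose(original, millilitres=amount * FANGCUNBI_ML)
    # Other non-weight units (pieces, handfuls, ...) are not forced into grams.
    return Dose(original)

# Usage with a dummy factor (1.0 is a placeholder, not a historical value).
dummy_table = {"liang": 1.0}
by_weight = normalize_dose(3, "liang", dummy_table)
by_volume = normalize_dose(2, "fangcunbi", dummy_table)
by_count = normalize_dose(1, "mei", dummy_table)
```

Returning a record with empty numeric fields, rather than guessing a weight, mirrors the paper's position that non-weight units should stay as written until drug quality and specifications can be established.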

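Point 6 of the abstract recommends that a prescription used as a drug inside another prescription be split into its own drugs, which then join the host prescription. A minimal sketch of that expansion is given here; the function name, the dictionary layout, and the placeholder prescription and drug names are assumptions for illustration, and the cycle check is an added safeguard that the abstract does not mention.

```python
from typing import Dict, FrozenSet, List

def expand_ingredients(name: str,
                       formulas: Dict[str, List[str]],
                       _seen: FrozenSet[str] = frozenset()) -> List[str]:
    """Return the flat drug list of a prescription.

    Any ingredient that is itself a key of `formulas` is replaced, in place,
    by that prescription's own (recursively expanded) drugs.
    """
    if name in _seen:
        # Added safeguard: a prescription that contains itself cannot be expanded.
        raise ValueError(f"circular reference through {name!r}")
    seen = _seen | {name}
    flat: List[str] = []
    for item in formulas[name]:
        if item in formulas:
            flat.extend(expand_ingredients(item, formulas, seen))
        else:
            flat.append(item)
    return flat

# Placeholder data: formula_B uses formula_A as one of its ingredients.
formulas = {
    "formula_A": ["drug_1", "drug_2"],
    "formula_B": ["drug_3", "formula_A"],
}
expanded = expand_ingredients("formula_B", formulas)
```

Whether the expanded drugs should also keep a pointer to the prescription they came from is a design question this sketch leaves open.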
  • 【Classification number】 R289
  • 【Times cited】 4
  • 【Downloads】 818