Traffic Accident Analysis Based on Data Mining (基于数据挖掘的道路交通事故分析研究)

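【Author】 Sun Yixuan (孙轶轩)

【Advisor】 Shao Chunfu (邵春福)

【Author information】 Beijing Jiaotong University, Transportation Planning and Management, 2014, doctoral thesis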

【Abstract】 During the 12th Five-Year Plan period, China's economy and society kept growing rapidly, urbanization deepened, road infrastructure steadily improved, and the number of motor vehicles, the number of drivers, and the volume of road traffic continued to rise. Road traffic has therefore played an increasingly visible role in supporting and guiding economic and social development. At the same time, road traffic safety has become a key issue that bears on the safety of people's lives and property and constrains the quality and efficiency of development, and it has received attention at the level of national security strategy.

A road traffic accident is a process in which people or property are harmed because the coupling of dynamic and static factors such as people, vehicles, roads, and the environment breaks down. Historical accident data directly reflect how these factors interacted when the accident occurred. Because accidents are multifactorial, contingent, and fuzzy in nature, accident analysis generally takes historical accident data as its research object. The theories and methods proposed for this purpose aim to analyze the influencing factors from multiple angles and at multiple levels, to reveal latent patterns in the relationships among accident records, and to support traffic safety management and accident prevention.

Data mining is a family of data analysis methods that extract implicit, previously unknown concepts, rules, regularities, and patterns of potential value for decision making from large volumes of data. Applying it to historical accident data is difficult for two reasons. First, such data are mostly used for descriptive statistics of four indicators (number of accidents, number of injuries, number of deaths, and property loss), so their latent information value is not fully exploited. Second, the data are discrete, multidimensional, and contain fuzzy factor sets, and the collection process has problems of completeness, objectivity, and standardization. These properties limit the mining of accident data and directly reduce the effectiveness of traditional data analysis theories and methods.

This thesis addresses the characteristics of the road traffic accident data collected in China and the key problems in analyzing them. It covers three aspects: accident severity analysis, accident prediction, and accident causation analysis. Using classification, regression, cluster analysis, association rule mining, and related data mining theories and methods, it builds a data-mining-based framework for road traffic accident analysis and examines in depth how accidents relate to people, vehicles, roads, the environment, and other factors. The main results are as follows.

1. Taking road traffic accident information collection data as the research object and applying data mining theories and methods, the thesis builds a road traffic accident analysis framework. The framework provides a data foundation and theoretical basis for revealing the factors that influence accidents and how they act, predicting accident trends, improving accident prevention mechanisms, and raising the safety level of the road traffic system.

2. Building on an understanding of how background factors (people, vehicles, roads, and the environment) are distributed and how they influence accidents, the thesis compares accident information collection technologies and data characteristics across countries. It focuses on the current state of accident information collection in China, in particular the structure of the accident data, which lays the groundwork for data preparation in data mining.

3. The thesis applies classification to accident severity analysis and builds linear and nonlinear TPMSVM classification models of accident severity under binary and multiclass schemes. It also establishes a feature-selection-based method for analyzing the background factors of accident severity: feature variables are ranked by their contribution to classification performance in order to identify the core variables that influence severity. In the empirical study, feature selection and parameter optimization algorithms yield the best linear and nonlinear classification accuracy under cross-validation, together with the feature importance rankings.

4. The thesis proposes a combined time series forecasting model based on ARIMA and SVR to produce point forecasts of the four accident indicators. To obtain information on the overall trend and the range of variation of road traffic accidents, it further studies a trend forecasting model based on information-granulated SVR: triangular fuzzy granules are constructed and SVR models are used to forecast the trend and range of the four indicator series. An empirical study is carried out.

5. Based on the distribution of microscopic accident characteristics, the thesis builds a clustering model of severe accident features based on the two-step BIRCH algorithm and an accident cause identification model based on decision trees, which together provide micro-level mining for accident causation analysis.
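The abstract mentions constructing triangular fuzzy granules from the indicator series but does not describe the procedure. The following is only a minimal sketch of one common simplification, not the thesis's method: each non-overlapping window is summarized by its minimum, median, and maximum. The window rule, the names `TriangularGranule`, `granulate`, and `membership`, and the sample numbers are all assumptions made for illustration.

```python
from statistics import median
from typing import List, NamedTuple, Sequence


class TriangularGranule(NamedTuple):
    low: float  # smallest value in the window (left foot of the triangle)
    mid: float  # representative value of the window (peak of the triangle)
    up: float   # largest value in the window (right foot of the triangle)


def granulate(series: Sequence[float], window: int) -> List[TriangularGranule]:
    """Summarise consecutive non-overlapping windows as (low, mid, up) triples.

    Trailing values that do not fill a whole window are dropped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    granules = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        granules.append(TriangularGranule(min(chunk), median(chunk), max(chunk)))
    return granules


def membership(g: TriangularGranule, x: float) -> float:
    """Triangular membership degree of x in granule g, between 0 and 1."""
    if x == g.mid:
        return 1.0
    if g.low < x < g.mid:
        return (x - g.low) / (g.mid - g.low)
    if g.mid < x < g.up:
        return (g.up - x) / (g.up - g.mid)
    return 0.0


# Hypothetical monthly accident counts (made-up numbers, not data from the thesis).
monthly_counts = [120, 135, 128, 140, 152, 149, 160, 158, 151, 143, 138, 130]
granules = granulate(monthly_counts, window=3)
for g in granules:
    print(g)
```

Under this assumption, the `low`, `mid`, and `up` sequences could each be passed to a separate regression model (for example an SVR) so that the three forecasts for the next window describe its likely range and central level.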