基于非参数回归的短时交通流量预测方法研究

A Study on Short-term Traffic Volume Forecasting Based on Non-Parametric Regression

【Author】 Zhang Xiaoli (张晓利)

【Advisor】 He Guoguang (贺国光)

【Author information】 Tianjin University, Management Science and Engineering, 2007, PhD

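【Abstract (translated from the Chinese original)】 Short-term traffic volume forecasting is one of the key technologies of Intelligent Transportation Systems (ITS). Its forecasting performance, and whether it meets real-time requirements, bear directly on the effective implementation of traffic control and guidance systems. Starting from an analysis of the characteristics of short-term traffic flow, this dissertation examines non-parametric regression from two perspectives, induction-deduction and nonlinear time-varying systems, and explains in principle why the method is suitable for short-term volume forecasting. The key steps and influencing factors in applying non-parametric regression are discussed.

As a relatively new intelligent method, non-parametric regression still has many shortcomings that limit its practical use, chiefly a poorly structured sample database, an inefficient search strategy, and an open-loop system. Starting from these defects, the dissertation improves the method in several respects so that it achieves higher forecasting accuracy and meets real-time requirements. The main improvements are:

(1) The raw volume data and the search data are stored separately, and database structures and search strategies are built for one-dimensional and multidimensional data search. Using balanced binary trees and R-trees as the logical structure and a static linked list as the physical structure greatly reduces the time needed for data search and improves the real-time performance of forecasting.

(2) A closed-loop feedback path is added to the most critical step of the forecasting system, pattern matching. Forecasting errors are used to correct the pattern-matching results, which makes the matching process more reasonable and improves forecasting accuracy.

(3) The factors affecting the robustness of non-parametric regression forecasting are analyzed, with emphasis on the conflict that arises when the system must be rebuilt: a large amount of raw data has to be collected while forecasts are still needed in real time. A coefficient database and batch forecasting are proposed to resolve this conflict.

Raw volume data do not come with the center points and the K nearest neighbors around each center that non-parametric regression requires, and they are high-dimensional and highly redundant, so preprocessing is necessary. In this dissertation, principal component analysis is used to reduce dimensionality and remove correlation among variables, and cluster analysis is used to remove redundant data and to obtain the data center points and their neighbors.

Traffic simulation software is used to simulate a typical road network structure, yielding volume data under various simulation conditions. For the data needed to build the database, that is, the volume patterns of the simulated network, parameters are set with large spans to obtain the boundary states of the various volume patterns. For the data needed for testing, parameters are set with small spans, which is better suited to studying how the patterns evolve. The comparison focuses on the two database data structures in terms of forecasting accuracy and forecasting time. The results show that the one-dimensional search outperforms the multidimensional structure.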

【Abstract】 Short-term traffic volume forecasting is one of the key technologies of Intelligent Transportation Systems (ITS). Forecasting performance and the ability to meet real-time requirements directly affect how effectively traffic control and traffic guidance systems can be realized.

Based on an analysis of the properties of short-term traffic flow, this dissertation examines Non-Parametric Regression (NPR) from two angles, the inductive-deductive method and the theory of nonlinear time-varying systems, and argues that NPR is theoretically suitable for short-term traffic volume forecasting. The main steps and influencing factors in applying NPR are discussed.

As a relatively new intelligent method, NPR has several shortcomings that restrict its application: an ill-suited database structure, low search efficiency, and an open-loop structure, among others. This dissertation improves the method to raise forecasting accuracy and meet real-time requirements. The main improvements are:

(1) The original volume data and the search data are stored in two separate databases. The databases are built on one-dimensional and multidimensional structures and search strategies. Using a balanced binary tree and an R-tree as logical structures and a static linked list as the physical structure reduces search time and helps meet the real-time requirement.

(2) A closed feedback loop is added to the most important step, pattern matching. The matching results are corrected using forecasting errors to improve forecasting accuracy.

(3) The factors affecting the robustness of NPR are analyzed, with a focus on the conflict between collecting original data and forecasting in real time when the system is rebuilt. A coefficient database and the idea of batch forecasting are put forward to resolve this conflict.

NPR needs data centers and the K nearest neighbors around every center, but the original data do not provide these. The original data are also high-dimensional and highly redundant, so preprocessing is necessary. In this dissertation, principal component analysis is adopted to reduce the dimensionality of the input variables and eliminate the correlations among them, and redundant data are removed by cluster analysis.

A typical road network is simulated with traffic simulation software, and volume data are obtained under a variety of simulation conditions. For the data used to create the database, which represent the volume patterns of the simulated network, the simulation parameters are set with large spans to capture the marginal states of the volumes. For the test data, the simulation parameters are set with small spans to study how the patterns evolve. The experiments focus on comparing the two data structures with respect to forecasting accuracy and time consumption. The comparison shows that the one-dimensional structure is superior to the multidimensional one.
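As a rough illustration of the pattern-matching idea behind non-parametric regression forecasting, the following minimal Python sketch predicts the next volume from the K historical states closest to the current one. It is not the dissertation's implementation: the function name `knn_forecast`, the `lag` and `k` parameters, the Euclidean distance, and the inverse-distance weighting are all assumptions made for illustration, and the plain linear scan stands in for the balanced binary tree and R-tree indexes the abstract describes.

```python
import math


def knn_forecast(history, lag, k):
    """Forecast the next value with a k-nearest-neighbour regression.

    history: past traffic volumes, oldest first (hypothetical input format).
    lag:     number of most recent values used as the state vector (assumed).
    k:       number of nearest historical states to average (assumed).
    """
    if len(history) <= lag:
        raise ValueError("history must contain more than `lag` values")

    current = history[-lag:]

    # Pair every historical state vector with the value that followed it,
    # and record its Euclidean distance to the current state.
    candidates = []
    for i in range(len(history) - lag):
        state = history[i:i + lag]
        following = history[i + lag]
        candidates.append((math.dist(state, current), following))

    # Linear scan plus sort; a tree index would replace this step
    # when search time matters.
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]

    # Inverse-distance weighting: closer patterns contribute more.
    weights = [1.0 / (dist + 1e-9) for dist, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

For a strictly periodic series such as `[10, 20, 30, 10, 20, 30, 10, 20, 30]` with `lag=2`, the current state `[20, 30]` matches earlier occurrences exactly, so the forecast is dominated by the value that followed them.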

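  • 【Online publication contributor】 Tianjin University
  • 【Online publication year/issue】 2009, Issue 08
  • 【Classification code】 U491.113
  • 【Citation count】 18
  • 【Download count】 1072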