
Web Log Based Data Mining (基于WEB日志的数据挖掘)

【Author】 Wang Chunxia (王春霞)

【Advisor】 Fan Ming (范明)

【Author information】 Zhengzhou University, Computer Software and Theory, 2003, Master's thesis

【Abstract】 Data mining is one of the most active fields in database research. Owing to its wide range of applications and practical significance, both research on and application of data mining techniques have advanced rapidly and attracted considerable attention from academia and the information industry at home and abroad. Data mining discovers interesting, hidden, and previously unknown knowledge from large volumes of data. Data mining techniques mainly address structured data, whereas Web data mining applies them to the Internet, extracting interesting and potentially useful patterns from semi-structured or unstructured Web pages. Although the Internet is a semi-structured system that is difficult to process, Web server log records are well structured, which makes them well suited to data mining. Moreover, Web log mining is a branch of Web usage mining and, as an important part of Web mining, has its own theoretical and practical significance. This thesis systematically describes the progression from data mining through Web data mining to Web log mining, with the emphasis on Web log mining. It discusses how to carry out Web log mining and which data mining techniques should be used. The techniques are then applied to the Shangqiu Information Port website (http://www.sqinfo.ha.cn): its Web server log records are mined and a Web log mining system is built. Network administrators can use the results of the log analysis to improve the site's design, manage the site effectively, and safeguard network security. Finally, the thesis is summarized, and directions for further research and future work are proposed.

【Key words】 Data Mining; Web Data Mining; Web Log Mining; Cookie
  • 【Online publication contributor】 Zhengzhou University
  • 【Online publication issue】 2004, Issue 04
  • 【Classification number】 TP393.09
  • 【Times cited】 2
  • 【Downloads】 409