Research on Users Patterns Discovery Based on Web Usage Mining (基于Web使用挖掘的用户模式识别研究)

【Author】 覃拥军

【Advisor】 刘先锋

【Author information】 Hunan Normal University, Computer Application Technology, 2008, Master's thesis

【Abstract】 Data mining is a new information technology that has emerged in recent years with the development of database and artificial intelligence technology. It is also an important and pressing problem raised by the development and widespread application of computer science and technology, especially computer networks. Data mining discovers interesting, hidden, and previously unknown knowledge in large volumes of data. Data mining techniques mainly address structured data, whereas Web data mining applies to the WWW and extracts interesting, potential patterns from semi-structured or unstructured Web pages. Web server logs are well structured, which makes them well suited to data mining. Web usage mining is a very important direction among the three research areas of Web mining: by analyzing and exploring the regularities in Web log records, it can identify potential e-commerce customers, improve the quality of network service for users, and improve the performance of the Web server system.

This thesis discusses the problems of Web usage mining from a clustering-based perspective. It first gives a systematic account of the progression from data mining and Web data mining to Web log mining. Through a discussion of data mining on Web logs, it shows how to carry out Web log mining and which data mining techniques should be used. It then examines clustering from a theoretical standpoint, analyzing the concept of clustering, common clustering methods, and common clustering algorithms. For the pattern discovery phase of Web usage mining, the thesis improves the BIRCH algorithm and applies the improved algorithm to Web user pattern discovery, verifying its effectiveness.

  • 【Classification number】TP311.13
  • 【Citation count】4
  • 【Download count】186