Research on Personalized Intelligent Recommendation Technology Based on WUM

【Author】 Zhou Yu (周宇)

【Advisor】 Zhang Sen (张森)

【Author Information】 Zhejiang University of Technology, Computer Application Technology, 2003, Master's thesis

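【Abstract (translated from the Chinese)】 With the rapid development of the Internet and the maturation of WWW (World Wide Web) technology, which now reaches into every aspect of social life, the available information resources keep growing in volume and variety, and the information people exchange is inevitably becoming electronic and massive. The huge volume of unorganized information, together with the wide distribution of information resources across the Internet, makes it harder for users to find information of interest, and users do not know how to discover the resources they need more effectively. Moreover, existing information publishing systems and search engines, because of their inherent shortcomings, cannot effectively solve these two kinds of problems. Web mining, which grew out of combining traditional data mining with the Web, opens a new way to address them. This thesis attempts to use Web mining to analyze massive Web access log data in depth, to mine users' personalized access transaction patterns, and on that basis to recommend information to users intelligently, achieving personalized, proactive information services. The main work covers the following aspects:
(1) The origins and background of data mining technology are analyzed, and the current state of data mining research in China and abroad is introduced.
(2) The architecture of Web data mining is analyzed and studied in depth. Web data mining is surveyed, with the relevant definitions and classifications given; mining techniques for Web logs and semi-structured data are discussed in detail; and the general process of Web log data mining is described.
(3) The general workflow of preprocessing for Web usage mining and the related definitions are discussed. Three transaction identification methods are proposed: one based on reference length, one based on maximal forward reference, and one based on time windows.
(4) Two clustering methods for user transaction patterns are discussed: clustering of navigation-content transaction patterns based on maximal forward access paths, and clustering of content-only transaction patterns. For these, a similarity measure between user transactions based on structure coefficients and a similarity measure based on common-ancestor and common-descendant similarity coefficients are proposed, respectively. The experimental results show that clustering navigation-content transaction patterns based on maximal forward access paths groups together user transaction patterns with similar access paths, and is therefore better suited to online personalized recommendation services, while clustering content-only transaction patterns is better suited to cluster analysis of strongly related Web pages.
(5) An online personalized intelligent information recommendation service based on Web usage pattern mining is studied. It is divided into an online part and an offline part. The offline part mines, from the access log files of the site server, user transaction patterns suitable for online intelligent personalized recommendation, using association rule mining and clustering of user transactions to obtain users' personalized patterns. The online part implements a personalized intelligent recommendation service based on association rule mining and another based on URL cluster patterns. The two recommendation methods are analyzed and compared, and their advantages and disadvantages are summarized. The experimental results show that the intelligent recommendation system is feasible and effective.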

【Abstract】 With the rapid growth of the Internet and the maturation of the WWW (World Wide Web), applications based on this technology are entering every aspect of society. The amount of usable information keeps growing, as does the variety of its types, and the information people exchange is inevitably becoming electronic. Because the information is vast and unorganized and resources are spread widely across the Internet, users find it difficult to locate what they need. Furthermore, existing information access tools and search engines cannot resolve these problems efficiently because of their inherent defects. The combination of data mining and the Web offers a new way to resolve them. This thesis attempts an in-depth analysis of Web log data through Web data mining to obtain users' transaction patterns and thereby deliver intelligent, personalized recommendation services. The contents of this dissertation are as follows:
(1) We review the origin and background of data mining technology and introduce the current status of international and domestic research on data mining.
(2) We analyze the architecture of Web data mining in depth, give an outline of Web data mining with its definitions and categories, and describe the general process of data mining for Web logs.
(3) We introduce the general structure and definitions of the data preprocessing phase of Web log mining. Transaction identification methods based on reference length, maximal forward reference, and time windows are proposed.
(4) We discuss clustering methods for two kinds of user transaction patterns: navigation-content transactions based on maximal forward reference, and content-only transactions. In the former, the similarity measure between user transaction patterns incorporates the structure of the Web site and the URLs involved. In the latter, the similarity measure uses direct paths, common ancestors, and common descendants to cluster user transaction patterns for online personalized intelligent recommendation services.
(5) We propose an intelligent personalized recommendation method based on user transaction patterns and the user's current navigational activity. The overall process is divided into an offline part and an online part. Offline, Web mining tasks run on the Web server logs and produce a file of user transaction patterns. Online, candidate URLs for recommendation are determined by matching the current active session against association rules in the aggregating tree or against URL clusters. The advantages and shortcomings of the two methods are discussed. The experiments demonstrate that our approach is applicable and effective.
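
The abstract names maximal forward reference as one basis for transaction identification but does not spell out the procedure. The following is a minimal Python sketch under the commonly used definition, in which a forward path ends as soon as the user returns to a page already on the current path. The function name, the single-letter page identifiers, and the handling of page reloads are illustrative assumptions and are not taken from the thesis.

```python
def maximal_forward_references(path):
    """Split one user's ordered page sequence into maximal forward references.

    Assumption: a "backward move" is any visit to a page that is already on
    the current forward path; the thesis may define this differently.
    """
    transactions = []
    current = []       # pages on the current forward path
    extending = False  # True while the most recent move was a forward move

    for page in path:
        if current and page == current[-1]:
            continue  # assumption: ignore immediate reloads of the same page
        if page in current:
            # Backward move: emit the path only if it was just being extended.
            if extending:
                transactions.append(list(current))
            # Truncate the path back to the revisited page.
            current = current[:current.index(page) + 1]
            extending = False
        else:
            current.append(page)
            extending = True

    # The last forward path is also a maximal forward reference.
    if extending and current:
        transactions.append(list(current))
    return transactions


# Hypothetical click sequence for one session, one letter per page.
session = list("ABCDCBEGHGWAOUOV")
for t in maximal_forward_references(session):
    print("".join(t))
```

For the hypothetical session above, the sketch yields the paths ABCD, ABEGH, ABEGW, AOU, and AOV. Each could then serve as one transaction for the clustering or association rule steps described in items (4) and (5).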

  • 【Classification Code】TP393.09
  • 【Times Cited】2
  • 【Downloads】127